Tuesday, July 27, 2010

Yahoo! Openhack India 2010 - FlicksubZ

After a long time, one more occasion to use my blog. Yes, we (me, Sudeep, Parashu) won the "Best in Show" award at Yahoo's Openhack India 2010. Yes!! Once again!! Like previous years, there were 24 hours of coding, beer, food, tea, Red Bull, etc. One good change this time was the increase in the number of hackers. Yes, there were 430 hackers flooding Taj Residency.

What did we do this time?

Automatic, real-time closed captioning/translation for Flickr videos.


We captured the audio stream that goes out to the speakers and fed it back in as microphone input. We used the Microsoft Speech API and Julius to convert the speech to text. A Greasemonkey script synced the video with the transcription server (our local box) and displayed the transcribed text on the video. Before displaying the text, we translate it into the language the user has chosen. (We used Google's Translate API for this.)
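The syncing step can be sketched roughly as below. This is only a minimal illustration in Python, not the code we ran at the hack (the real sync lived in the Greasemonkey script). The `Segment` type, the `caption_at` and `caption_text` functions, and the pluggable `translate` callable are all hypothetical names, and the sketch assumes the transcription server hands back timed segments sorted by start time.

```python
from bisect import bisect_right
from typing import Callable, List, NamedTuple, Optional


class Segment(NamedTuple):
    """One transcribed chunk of speech (hypothetical shape)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def caption_at(segments: List[Segment], t: float) -> Optional[Segment]:
    """Return the segment covering playback time t, or None.

    Assumes segments are sorted by start time and do not overlap.
    """
    starts = [s.start for s in segments]
    # Index of the last segment that starts at or before t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < segments[i].end:
        return segments[i]
    return None


def caption_text(
    segments: List[Segment],
    t: float,
    translate: Optional[Callable[[str], str]] = None,
) -> str:
    """Text to overlay at time t, translated first if a translator is given."""
    seg = caption_at(segments, t)
    if seg is None:
        return ""  # nothing is being said at this moment
    # translate stands in for a call to a translation service.
    return translate(seg.text) if translate else seg.text
```

The overlay would call `caption_text` with the player's current time on every tick and show whatever comes back. Passing the translator in as a plain function keeps the lookup independent of any particular translation service.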

Some of the speech recognition frameworks that we tried are Sphinx 4.0, Windows SAPI, and Julius. None of these is 100% accurate, but they are definitely better than watching videos without any captions. I have read that Nuance Dragon does really well in this space, but it's very costly.

Extension and usefulness

There are countless videos on the internet, and we can't manually caption all of them. This hack captions them automatically. The result might not be accurate, but we can store the auto-generated captions as an SRT (SubRip subtitle) file and provide a simple UI for users to edit or correct the captions they think are wrong. The corrections can then be used to train the speech recognition system. Over a short period of time, by using the internet crowd, we can get a good speech recognition engine.
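Storing the captions as SRT is straightforward, since the format is just numbered blocks with a time range and the text. Here is a minimal sketch in Python, assuming the recognizer gives us `(start, end, text)` tuples with times in seconds. The function names `srt_timestamp` and `to_srt` are made up for illustration and are not from the hack itself.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as the contents of an SRT file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each block: sequence number, time range, caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Blocks are separated by a blank line.
    return "\n".join(blocks)
```

For example, `to_srt([(0.0, 2.5, "hello")])` gives a single block numbered 1 with the range `00:00:00,000 --> 00:00:02,500`. A correction UI would only need to rewrite the text line of a block and leave the timing alone.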

What did we get as an award?

1. Chris Heilmann's compliment !!!
2. Certificate signed by David Filo !!!
3. Xbox 360 Elite and 3 iPod nano 4G 8 GB :) :)


Bharathiraja Subramanian said...

Congratulations Srithar!!!

Ayyanar said...

Congratzzz da :) way to go..

Brandon said...


Do you have any examples of the transcribed/translated videos? (I guess a video of the video?)

Pasu said...


BabuSrithar said...

@Bharathi, @Ayyanar, @Pasu, thanks

@Brandon, http://www.youtube.com/watch?v=zaQGrK_fkD4 is a small demo on the app.

Brandon said...

Thanks for the video link. Very cool work--we're looking at elements of this, I am pretty surprised at the quality of the initial speech recognition.

I suspect you'd find though, if you used the same engine on English speakers from different backgrounds (say from any of the Indian states) that you'd get much more variable accuracy.

Nonetheless, very cool and congratulations.

Umeshnrao said...

Congratulations!!! Really liked your hack... is your hack on github ?

Richard said...

Congratulations Srithar, very nice and amazing work you have done. I have seen the video and the result is very good. Thanks for sharing the good moment.
