I’m here at YouTube’s office in San Bruno, where the company is holding a press conference to discuss the launch of auto-captions. YouTube Director of Product Management Hunter Walk kicked off the event by discussing some of YouTube’s goals through the years — one of which is accessibility.
Walk said that a few years ago, accessibility meant giving users more ways to access their content (for example, through their mobile phones). Now, the company is focusing more on making its content accessible to even more people. Google software engineer Ken Harrenstein then took the stage to walk through some of YouTube’s initiatives on this front.
Harrenstein recapped YouTube’s past feature launches, including captions and subtitles. In November of last year, the company began a limited rollout of auto-captions, which use speech recognition to automatically transcribe what’s said in a video. Now it’s going to enable the feature for all videos uploaded to YouTube where English is spoken.
This makes the videos accessible not just to deaf people, but also to viewers around the world, who can translate any video that’s in English to another language. However, Harrenstein took time to point out that the captioning isn’t perfect, showing how the words “SIM card” got transcribed to “salmon”.
Here are some of the details YouTube shared for people uploading videos:
- While we plan to broaden the feature to include more languages in the months to come, currently, auto-captioning is only for videos where English is spoken.
- Just like any speech recognition application, auto-captions require a clearly spoken audio track. Videos with background noise or a muffled voice can’t be auto-captioned. President Obama’s speech on the recent Chilean Earthquake is a good example of the kind of audio that works for auto-captions.
- Auto-captions aren’t perfect, and as with any other transcription, the owner of the video needs to check that they’re accurate. In some cases, the audio file may not be good enough to generate auto-captions at all. But please be patient: our speech recognition technology gets better every day.
- Auto-captions should be available to everyone who’s interested in using them. We’re also working to provide auto-captions for all past user uploads that fit the above-mentioned requirements. If you’re having trouble enabling them for your video, please visit our Help Center.
Google researcher Mike Cohen then took the stage to talk about Google’s speech technology. The ultimate vision, he said, is to provide accurate captions for all videos in all languages. But that goal comes with many problems, including a massive vocabulary, poor recordings, background noise, and accents. And every language comes with its own unique challenges.
YouTube hasn’t yet run all of its videos through the new transcription service, but video owners will be able to use each video’s options screen to manually request that their older videos be transcribed sooner.
Harrenstein, who is deaf, retook the stage to tell a personal story. When he was at MIT, he didn’t go to many of his lectures because he couldn’t understand them (they weren’t signed). Now, he can watch MIT lectures on YouTube, with captioning enabled.
Next, some students from the California School for the Deaf in Fremont, and their instructor Joey Baer, took the stage to thank YouTube for the launch. Check out their enthusiasm in the video below. Really, this is quite amazing.