How do You Solve the Accessible Video and Audio Challenge?

After a decade of working in the field of web accessibility, I still hear from people who think that creating accessible multimedia for the web is too expensive and too hard to do. I’ll admit, I was initially overwhelmed by the challenges. But today, I know that accessible multimedia is doable and smart business. You just need to know how to be super efficient with your resources and how to prioritize. And while I know a lot about this topic, it is evolving and I can always learn more from the accessibility tribe.

So, on Monday, June 18th, the amazing Elle Waters (@nethermind) and I (@goodwitch) hosted an #a11ychat “Making Audio and Video Accessible to All” and we were joined by our colleagues to share what we know. You can read the transcript or chirpstory. Here are some of my favorite lessons learned:

Captioning Strategy

To be effective, your organization needs to adopt a captioning strategy. @jared_w_smith of WebAIM and I have both seen that outsourcing captioning to firms like CaptionSync is incredibly cost effective. I continue to keep an eye on automated solutions, like Dragon NaturallySpeaking. But to date, the error rates on speech recognition software are too high to make this method the most cost effective. Let me put that last statement in context. Speech recognition software can be trained to your particular voice…and when it is trained, the accuracy rates can be very, very good. So, if you have a single voice speaking on the audio/video and the speech recognition software is trained to that voice, software like Dragon NaturallySpeaking can be very cost effective. But when you use speech recognition software and any of the following are present, your results are going to need a significant amount of manual correction: more than one voice, an accent, background noise.

My rule of thumb? If the error rate of the speech recognition software is greater than 3%, it is more cost effective to outsource the transcription/captioning to professionals than it is to manually correct the automated transcript in-house. It takes more time (and/or costs more money) to review and edit the mistakes than it would for a trained transcriber to create the transcript from scratch. There is an interesting and effective method for using speech recognition software called “re-voicing”. Some transcription houses use this method to create their transcripts. Using the re-voicing method, a trained transcriber listens to the multimedia to be captioned and re-speaks all dialogue, in their own voice, directly into speech recognition software that has been trained to their particular voice. Because the re-voicer is a trained transcriber, they know to describe any relevant sounds beyond pure dialogue, and they apply other transcription practices that ensure efficient, quality content.

Acceptable Error Rate

What is an acceptable error rate in the transcript / caption file? The error rate should not exceed 1% and in reality should be at or below 0.5%. If you think I’m being a perfectionist, I’m really not. Ask anyone in the field of professional transcription and you will get a very similar answer. Need to know why an error rate of 3% is not acceptable? Check out this interesting post on “When Does 1 Error = 5 Errors”.
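One way to put numbers on these thresholds is to compare a transcript against a known-good reference and count word-level mistakes. The sketch below is only an illustration: it assumes that "error rate" means word error rate (substitutions, insertions, and deletions divided by the number of reference words), which the post itself does not define, and the function names and category labels are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: "error rate" is measured as word error rate; other
    definitions (for example, per-character) would give different numbers.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Levenshtein distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def triage(rate: float) -> str:
    """Map an error rate onto the thresholds discussed in the post.

    The labels are hypothetical; only the 3%, 1%, and 0.5% cut-offs
    come from the text.
    """
    if rate > 0.03:
        return "outsource"            # cheaper than correcting in-house
    if rate > 0.01:
        return "needs correction"     # still above the acceptable ceiling
    if rate > 0.005:
        return "acceptable"           # at or below 1%
    return "target quality"           # at or below 0.5%


if __name__ == "__main__":
    sample_rate = word_error_rate("captions help everyone search video",
                                  "captions help every one search video")
    print(f"{sample_rate:.0%} -> {triage(sample_rate)}")
```

In practice you would run this on a few minutes of audio that you have transcribed carefully by hand, then use the result to decide whether an automated transcript is worth correcting at all.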

Social Justice

The message is crystal clear. Captioning of multimedia on the web is a civil right. It is already a legal requirement under the Americans with Disabilities Act (ADA). Court case after court case has shown that the web is a place of public accommodation, just like a physical store, convention center, motel, museum, library, school, zoo, gym, movie theatre… (the list goes on). Keep a close eye on the Netflix lawsuit and you will quickly realize that captioning is a legal requirement.

I’m not saying you have to caption every bit of multimedia you have on the web before the end of the day today. But I am saying you need to recognize that captioning is a requirement. Be smart and draft a captioning strategy for your organization. Consider outsourcing to a firm that specializes in transcription and captions. Prioritize your multimedia. First caption the content that is most important and that people use the most. Make it easy for people to request captioning on an item that has not been captioned yet, and be able to produce a captioned version of that item in a reasonable timeframe. Make transcripts and captions a part of new multimedia production.

I promise you that once you’ve gotten your first few videos captioned, you are going to realize this is doable. This is not going to break your budget. And I wager that within a few months…you will realize how captions are actually benefiting your business in many ways, including making your multimedia searchable.

For a real eye-opener, I recommend that you watch the movie ‘Audism Unveiled’. I know the movie had a profound impact on me. Intellectually I realized that captioning provided equal access. But I had no idea how painful the experience of discrimination against people who are deaf could be.

You can make a difference. You can caption your multimedia. You can also join the #captionTHIS social media movement and ask for equal access for all.


  1. Hi Glenda.

    Really enjoyed this article.

    From an Australian perspective, I believe that captioning is poised to enhance our lives dramatically in so many ways as we embrace the National Broadband Network (NBN) which is being rolled out to 93% of premises in Australia.
    i.e. it’s not just access for Deaf and people with a hearing loss, but it’s also useful for searching for information, helping people who don’t speak English as their first language and in noisy environments, etc.

    But I do worry that the gap is already too big to rein in – i.e. the oft-quoted figure of less than 1% of video content being captioned….

    Your thoughts?

    Is it really getting better (going to get better?) on the internet in captioning / accessibility terms?

  2. Michael, I do think it is getting better. I like to imagine what it was like before captioning was required on TV. If that volume of captioning could be solved, I think we can solve it on the internet as well. I also think of multimedia content in two categories: professional/business content and personal content. My focus is on getting professional/business content captioned.

    Indeed…captioning does result in a win for everyone. The media is so much more searchable once it is captioned.
