Talking Head Videos & Captions

I’m questioning the W3C’s Web Content Accessibility Guidelines 1.0 on captioning when it comes to “talking head” videos. I think a transcript of the video in XHTML format is more useful than captions. In fact, I think captioning conflicts with the guideline to make sure users have control of time-based multimedia. Ever tried to read captions flying by on the screen? Talk about a time-sensitive flow of information!

  • WCAG 1.0 Checkpoint 1.4

    For any time-based multimedia presentation (e.g., a movie or animation), synchronize equivalent alternatives (e.g., captions or auditory descriptions of the visual track) with the presentation.

  • WCAG 1.0 Guideline 7.

    Ensure user control of time-sensitive content changes.

My recommendation for providing equivalent alternatives would be:
Priority 1

  • provide the transcript of the presentation in XHTML
  • provide synchronized captions with the presentation when the transcript alone does not adequately convey the meaning

For most “talking head” videos (videos where you see only the head of the person speaking), a transcript provided in XHTML gives the user far more access to what was said in the video, because they can read it at their own pace. Just as there is an art to writing alt text, there is an art to captioning. Trust us to decide when a video needs to be captioned and when it just needs to be transcribed!
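To make the transcript option a bit more concrete, here is a minimal sketch of how a list of speaker turns could be turned into an XHTML page. It is only an illustration, not a recommended tool: the function name `transcript_to_xhtml`, the `(speaker, text)` input shape, and the sample dialogue are all assumptions of mine, and I have picked XHTML 1.0 Strict arbitrarily.

```python
from html import escape


def transcript_to_xhtml(title, turns):
    """Render (speaker, text) pairs as a minimal XHTML 1.0 Strict page.

    `title` and `turns` are hypothetical inputs for this sketch; a real
    transcript might also carry timestamps or notes about pauses and stress.
    """
    # Escape every piece of user text so characters like & and < stay valid XML.
    paragraphs = "\n".join(
        f"    <p><strong>{escape(speaker)}:</strong> {escape(text)}</p>"
        for speaker, text in turns
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
        '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
        '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n'
        "  <head>\n"
        f"    <title>{escape(title)}</title>\n"
        "  </head>\n"
        "  <body>\n"
        f"    <h1>{escape(title)}</h1>\n"
        f"{paragraphs}\n"
        "  </body>\n"
        "</html>\n"
    )


# Made-up sample data, purely for illustration.
sample_turns = [
    ("Host", "Welcome, and thanks for joining us."),
    ("Guest", "Happy to be here. Captions & transcripts are my favourite topic."),
]
print(transcript_to_xhtml("Talking Head Interview: Transcript", sample_turns))
```

Because the result is ordinary text markup, the reader controls the pace, and the content is available to anything that can read a web page.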

Confession: I’ve been forcing myself to transcribe/caption multimedia content recently, rather than finding some other innocent victim to do it for me. I consider it penance for recent inaccessible content I’ve been party to. But, to my delight, I’ve learned some valuable lessons and even had a bit of fun. I never said I was normal.

So, compadres, what do you think?


  1. I would probably go for the simple option. Not because it’s necessarily the best, but mainly because a transcript is easier to produce and more likely to be usable and accessible for a wide audience. Sure, there is an art to transcripts too (for example, stressing words, adding in pauses, etc.), but I feel more confident doing that than /proper/ captioning. Alright, I’m just being lazy! (But still accessible.)

  2. You know, another benefit of transcripts is that their content is more available to search engines than captions are. But you can’t provide transcripts for live webcasts, hmmm? If you are Deaf (or without speakers or headphones), the streaming video of, say, the local City Council meeting or another informational panel is meaningless without captions. Fortunately, there are services that can provide this. “Live captioning” is the process of simultaneously creating and transmitting captions for live video. I think they use court reporting techniques or something to make it work, and I’m not sure what the costs are, but if it is used, the archival material is captioned as well.
