Three Tips for Audio and Video Everyone Can Follow
Audio and video can carry a lot of teaching, as long as nobody is locked out of them. These three habits open them up.
Cover these and you are most of the way there:
- Add accurate captions
- Provide a transcript
- Never autoplay media
Add Captions to Audio and Video
Captions are a synchronised text version of the speech and the meaningful sounds in a recording.
For Deaf and hard-of-hearing learners they are the only way into the spoken content, and plenty of hearing viewers turn them on too: on a quiet train, in a loud room, or to follow an unfamiliar accent. Automatic captions get you most of the way, but they mangle names and technical terms, which is exactly the part that matters in teaching. Generate the captions, then read them through and correct them.
Some examples:
- Automatic captions on a science lecture might turn “mitosis” into “my toes is.” The error lands on exactly the term the lesson is about. A quick pass in the caption editor fixes it before it confuses anyone.
- Captions should cover the sounds that carry meaning, not only the speech. A line like “[fire alarm]” tells a Deaf viewer what a hearing viewer would notice. Without it, part of the scene simply goes missing for them.
- A guest lecturer’s name auto-captioned three different ways across one video looks careless and breaks search. Fix it once in the editor so the spelling is consistent. The credit is then correct, and learners can find that section by searching the name.
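If the same mistakes keep turning up, a small script can apply the fixes you have already found before you do the read-through. The sketch below assumes the captions are exported as a WebVTT file, and the correction table, the lecturer's name and the function name are all made up for illustration. It treats every line that is not the header, a timing line or a blank line as caption text, which is a simplification.

```python
import re

# Hypothetical correction table: what the auto-captioner wrote -> what was said.
# Build yours from the errors you spot while reading the captions through.
CORRECTIONS = {
    "my toes is": "mitosis",
    "professor newin": "Professor Nguyen",
    "professor new yen": "Professor Nguyen",
}


def correct_captions(vtt_text, corrections):
    """Apply whole-phrase, case-insensitive corrections to caption text lines."""
    fixed_lines = []
    for line in vtt_text.splitlines():
        # Leave the file header, cue timings and blank lines untouched.
        if line.startswith("WEBVTT") or "-->" in line or not line.strip():
            fixed_lines.append(line)
            continue
        for wrong, right in corrections.items():
            pattern = r"\b" + re.escape(wrong) + r"\b"
            # A function replacement keeps `right` literal (no backslash escapes).
            line = re.sub(pattern, lambda m, r=right: r, line, flags=re.IGNORECASE)
        fixed_lines.append(line)
    return "\n".join(fixed_lines)


# Assumed sample input, written out line by line.
SAMPLE_VTT = "\n".join([
    "WEBVTT",
    "",
    "00:00:01.000 --> 00:00:04.000",
    "Today professor newin explains my toes is.",
    "",
    "00:00:04.000 --> 00:00:06.000",
    "[fire alarm]",
])

print(correct_captions(SAMPLE_VTT, CORRECTIONS))
```

A script like this only catches errors you have already listed, so it supports the read-through rather than replacing it.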
Provide a Transcript
A transcript is the full text of the recording, and it does work that captions cannot.
People can read it at their own pace, search it, skim it, quote it and translate it, and for audio-only material it is the one practical route in for someone who cannot hear it. It is also the part a search engine can actually read. Captions help in the moment; a transcript helps before and after, so publish one alongside the recording.
Here is what that looks like in practice:
- An audio lecture posted with a transcript lets a student find a single term with a quick search. They can jump to the part they need rather than scrubbing through the audio. For someone who cannot hear it, the transcript is the only way in at all.
- A recorded webinar published with its transcript lets people scan the content in a minute. They can decide what is worth their time before committing to the recording. It also gives anyone revising a fast way to relocate a point.
- For a 50-minute recording, a transcript lets a learner who missed five minutes find exactly those five minutes. Without it, they have to re-watch the lot to be sure. The transcript turns a long hunt into a quick search.
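If you already have corrected captions, they can double as a first draft of the transcript. The sketch below is one way to do that, assuming a simple WebVTT file with no cue identifiers or notes. The function names and the sample text are illustrative. It keeps each cue's start time so that a search result points back to a place in the recording.

```python
def vtt_to_transcript(vtt_text):
    """Return a list of (start_time, text) pairs, one per caption cue."""
    entries = []
    start = None
    buffer = []
    for raw_line in vtt_text.splitlines():
        line = raw_line.strip()
        if "-->" in line:
            # Timing line: remember when this cue starts.
            start = line.split("-->")[0].strip()
            buffer = []
        elif not line:
            # A blank line closes the current cue, if there is one.
            if start is not None and buffer:
                entries.append((start, " ".join(buffer)))
            start, buffer = None, []
        elif start is not None:
            # Caption text; a cue may span more than one line.
            buffer.append(line)
    if start is not None and buffer:
        entries.append((start, " ".join(buffer)))
    return entries


def find_term(entries, term):
    """Return the cues whose text contains the term, ignoring case."""
    needle = term.lower()
    return [(start, text) for start, text in entries if needle in text.lower()]


# Assumed sample input, written out line by line.
SAMPLE_VTT = "\n".join([
    "WEBVTT",
    "",
    "00:00:01.000 --> 00:00:04.000",
    "Welcome back to the lecture.",
    "",
    "00:00:04.000 --> 00:00:09.000",
    "Today we look at mitosis,",
    "the topic of this unit.",
])

transcript = vtt_to_transcript(SAMPLE_VTT)
for start, text in find_term(transcript, "mitosis"):
    print(start, text)
```

For publishing, you would still tidy the result into readable paragraphs and add speaker names where they matter.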
Don't Autoplay Anything
Media that starts on its own talks straight over screen readers, pulls focus, and can be genuinely disorienting.
It is a real problem for people with anxiety or vestibular conditions, and it spends data the learner never chose to spend. Handing control back costs you nothing. Let the learner decide when to start, give them a player with working play and pause controls, and make sure those controls work from the keyboard.
What that looks like day to day:
- A video set to autoplay on page load talks straight over the screen reader announcing the page. The two voices compete and neither is clear. Switch autoplay off so the learner starts the video when they are ready.
- Check that a learner can reach the play button with the Tab key. They should be able to start the video with Space or Enter, without ever touching a mouse. Keyboard control is the difference between usable and locked out for many people.
- An audio clip set to autoplay on a quiz page startles the learner and covers the screen reader reading the question. The interruption lands at the worst possible moment. Let them press play themselves, when it suits them.
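If you maintain the course pages yourself, a quick scan can flag the obvious cases before anyone reaches them. The sketch below uses Python's standard `html.parser` to list `audio` and `video` tags that set `autoplay` or lack the native `controls` attribute. The class name, function name and sample markup are illustrative. It assumes native HTML media elements, so a custom player with scripted controls would be flagged wrongly.

```python
from html.parser import HTMLParser


class MediaAudit(HTMLParser):
    """Collect audio/video tags that autoplay or have no native controls."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("audio", "video"):
            return
        # Boolean attributes such as autoplay arrive with a value of None.
        names = {name for name, _ in attrs}
        line_number, _ = self.getpos()
        if "autoplay" in names:
            self.problems.append((line_number, tag, "autoplay is set"))
        if "controls" not in names:
            self.problems.append((line_number, tag, "no controls attribute"))


def audit_media(html):
    """Return a list of (line, tag, problem) tuples for the given markup."""
    parser = MediaAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


# Assumed sample page fragment.
SAMPLE_HTML = "\n".join([
    '<video src="lesson.mp4" autoplay></video>',
    '<audio src="clip.mp3" controls></audio>',
])

for line_number, tag, problem in audit_media(SAMPLE_HTML):
    print(f"line {line_number}: <{tag}> {problem}")
```

A clean result does not prove the player works from the keyboard. You still need to check by hand with Tab, Space and Enter.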