<video> elements must have captions

Rule ID: video-caption
Ruleset: axe-core 4.7
User Impact: Critical
Guidelines: WCAG 2.1 (A), WCAG 2.0 (A), Section 508, Trusted Tester

Compliance Data & Impact

User Impact

  • Critical

Disabilities Affected

  • Deaf

Guidelines

  • WCAG 2.1 (A)
  • WCAG 2.0 (A)
  • Section 508
  • Trusted Tester

WCAG Success Criteria [WCAG 2.1 (A)]

  • 1.2.2: MUST: Captions (Prerecorded)

WCAG Success Criteria [WCAG 2.0 (A)]

  • 1.2.2: MUST: Captions (Prerecorded)

Section 508 Guidelines

  • 1194.22: MUST: Web-based intranet and Internet Information & Applications
  • 1194.22 (a): MUST: A text equivalent for every non-text element shall be provided (e.g., via "alt", "longdesc", or in element content)

Trusted Tester Guidelines

  • 17.A: MUST: The multimedia provides accurate captions for the audio content.

How to Fix the Problem

Ensure every video element has captions, provided through a track element with the kind="captions" attribute. Ensure the captions convey all meaningful audio information in the video; this includes, but is not limited to, dialogue, musical cues, and sound effects.

Good captions not only include dialogue, but also identify who is speaking and include non-speech information conveyed through sound, including meaningful sound effects.

The following code shows how to add two different tracks - one in English and one in Spanish:

<video width="300" height="200">
    <source src="myVideo.mp4" type="video/mp4">
    <!-- One caption track per language; label is the name shown to users -->
    <track src="captions_en.vtt" kind="captions" srclang="en" label="English captions">
    <track src="captions_es.vtt" kind="captions" srclang="es" label="Spanish captions">
</video>

Captions and subtitles are not the same thing. Captions are necessary for deaf viewers to understand the content. In addition to the text of all dialogue and narration, captions include a text description of all important background noises and other sounds. Subtitles are generally language translations that help viewers understand content presented in a language they do not understand. A Spanish-language video could include English subtitles, for example. Subtitles generally include only dialogue and narration.

Given these differences, you should specify kind="captions" for deaf access, and not kind="subtitles".
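
As a rough illustration of the distinction, the markup below sketches a Spanish-language video that offers Spanish captions for deaf viewers alongside English subtitles for viewers who do not understand Spanish. The file names (entrevista.mp4, captions_es.vtt, subtitles_en.vtt) and the label text are hypothetical placeholders.

```html
<!-- Hypothetical Spanish-language video; all file names are placeholders -->
<video width="300" height="200">
    <source src="entrevista.mp4" type="video/mp4">
    <!-- Captions: dialogue, speaker identification, and meaningful sounds,
         written in the language spoken in the video -->
    <track src="captions_es.vtt" kind="captions" srclang="es" label="Spanish captions">
    <!-- Subtitles: a translation of the dialogue and narration only -->
    <track src="subtitles_en.vtt" kind="subtitles" srclang="en" label="English subtitles">
</video>
```

In this sketch, only the kind="captions" track addresses deaf access. The subtitles track is an additional translation and, on its own, would not meet the requirement described here.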

The src attribute gives the URL of the track file. The kind attribute describes the type of content in the file. The srclang attribute specifies the language of the track text using a valid language tag, such as en or es. The label attribute provides a human-readable name for the track. HTML requires only src on a track element, but kind defaults to subtitles when it is omitted, so kind="captions" must be set explicitly to provide captions. The srclang and label attributes are highly recommended because they increase clarity.

YouTube offers automatic captioning as a somewhat experimental feature. The automatic captions tend to be too inaccurate to use without some editing, but they still save quite a bit of work. Another useful YouTube feature is the ability to synchronize a transcript with the video automatically. You type up a transcript and upload it to YouTube, which then processes the video and transcript together, using voice recognition to align the transcript with the audio. The resulting timing tends to be quite accurate. In some cases, no additional editing is necessary. In other cases, you need to tweak the timing a bit, but at least you don't have to do all the work manually.

Why it Matters

If a video has no captions, deaf users have limited or no access to the information contained in it. Even if a captions track is available, ensure that it contains all meaningful audio information in the video, not just dialogue.

Deaf viewers can see everything in the video but cannot hear any of it. Without a caption track, they have no way of knowing the dialogue, narration, or essential non-speech sounds, such as "dramatic instrumental music," applause, screams, or other sounds that set the scene, provide context, or give meaning to the video.

Rule Description

An HTML5 video element must include a track element with the kind attribute set to "captions". The captions should convey all meaningful auditory information in the video, including dialogue, musical cues, sound effects, and other relevant information for deaf users.
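
Read literally, that description separates markup along the following lines. This is only a sketch based on the description, not a statement of how axe-core evaluates every case. The file names are hypothetical, and a captions track that merely exists still needs human review to confirm that its content is accurate and complete.

```html
<!-- Does not meet the description: no track element at all -->
<video width="300" height="200">
    <source src="myVideo.mp4" type="video/mp4">
</video>

<!-- Does not meet the description: a track exists, but its kind is not "captions" -->
<video width="300" height="200">
    <source src="myVideo.mp4" type="video/mp4">
    <track src="subtitles_en.vtt" kind="subtitles" srclang="en" label="English subtitles">
</video>

<!-- Meets the description: at least one track with kind="captions" -->
<video width="300" height="200">
    <source src="myVideo.mp4" type="video/mp4">
    <track src="captions_en.vtt" kind="captions" srclang="en" label="English captions">
</video>
```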

The Algorithm (in simple terms)

Ensures video elements have captions.


Other Resources

You may also want to check out these other resources.

Refer to the complete list of axe 4.7 rules.
