TikTok Transcripts for Accessibility Captions

Why accessibility captions matter

When you repurpose TikTok video content for your website, blog, or learning platform, the original in-app captions do not transfer. The video file alone provides no text alternative for deaf and hard-of-hearing users — and no fallback for viewers in sound-off environments.

WCAG 2.1 Success Criterion 1.2.2 (Captions (Prerecorded), Level A) requires captions for all prerecorded audio content in synchronized media. If your site hosts repurposed TikTok clips without captions, it fails this criterion. The solution: extract the TikTok caption track and attach it as a VTT file to your web video player.

What you need

Public TikTok video URLs with accessible caption tracks
A TokCaption account — sign up free
An HTML5 video player that supports the <track> element (most do)

Accessibility Caption Workflow

Extract

Pull the caption track

Paste the TikTok URL into TokCaption. The accessible caption data is extracted with timestamps.

Export

Download as VTT

Choose VTT export for web players or SRT for video editors. Both preserve timestamp sync.

Attach

Add to your video player

Upload the VTT file alongside your video and reference it with a <track> element for accessible playback.

Step 1: Extract the caption track

Open TikTok, copy the share link of the video you are repurposing, and paste it into TokCaption. Run the transcript job. TokCaption extracts the accessible caption track and presents the text with start and end timestamps for each segment.

No captions = no extraction. TokCaption reads the caption data embedded in the post. If the creator did not publish captions, you will need to create them manually or use a separate captioning service. TokCaption does not generate captions from audio.

Step 2: Export as VTT for web use

Click Export and select VTT (WebVTT). This format is natively supported by all modern browsers and works with the HTML5 <track> element. The exported file includes:

A WEBVTT header
Timestamp cues in HH:MM:SS.mmm --> HH:MM:SS.mmm format
Caption text for each segment
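
To make that structure concrete, here is a minimal Python sketch that assembles a WebVTT file from timed segments. It is an illustration only, not TokCaption's exporter: the function names format_timestamp and build_vtt and the sample caption text are assumptions made up for this example.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(segments):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # header, then a blank line before the first cue
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical sample segments, invented for this example.
sample = [
    (0.0, 2.5, "Welcome back to the channel."),
    (2.5, 5.0, "Today: three quick tips."),
]
print(build_vtt(sample))
```

Opening an exported file in a text editor should show the same three parts: the header line, a timing line per cue, and the caption text beneath it.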

If your workflow requires SRT instead (for example, if you are editing in Premiere Pro before publishing to the web), export as SRT and convert to VTT later — or export both.
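
If you take the SRT route, the later conversion is mostly mechanical: add the WEBVTT header, drop the numeric cue indexes, and switch the millisecond separator from a comma to a period. The Python sketch below shows one way to do that for plain SRT files. The function name srt_to_vtt is an assumption for this example, and the sketch does not handle SRT styling tags or positioning extensions.

```python
import re

# SRT timestamps use a comma before the milliseconds; WebVTT uses a period.
_SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT text to WebVTT text."""
    # Normalize line endings and drop a byte-order mark if present.
    text = srt_text.replace("\r\n", "\n").replace("\r", "\n")
    text = text.lstrip("\ufeff").strip()
    if not text:
        return "WEBVTT\n"
    cues = []
    for block in re.split(r"\n{2,}", text):
        lines = block.split("\n")
        # Drop the numeric cue index that SRT places above the timing line.
        if len(lines) > 1 and lines[0].strip().isdigit() and "-->" in lines[1]:
            lines = lines[1:]
        lines[0] = _SRT_TIMING.sub(r"\1.\2 --> \3.\4", lines[0])
        cues.append("\n".join(lines))
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"


if __name__ == "__main__":
    srt = "1\n00:00:00,000 --> 00:00:02,500\nHello\n"
    print(srt_to_vtt(srt))
```

Run the converted file through your video player once before deploying, since a timing line that did not match the expected pattern is passed through unchanged.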

Step 3: Attach to your HTML5 video player

Upload the VTT file to your server alongside the video file. Add a <track> element inside your video player markup:

<video controls>
  <source src="video.mp4" type="video/mp4" />
  <track kind="captions" src="captions.vtt" srclang="en" label="English" default />
</video>

The kind="captions" attribute tells assistive technology that this track provides captions for accessibility purposes, not just translation subtitles.

Reviewing captions for WCAG compliance

The raw TikTok caption track is a useful starting point, but its accuracy depends on what the creator published, so review it before relying on it for WCAG conformance. Common captioning guidelines suggest checking:

Reading speed — ensure no single caption stays on screen for less than 1 second or contains more text than a viewer can read in the displayed time
Line length — keep lines under 42 characters for readability
Speaker identification — if multiple speakers appear, add labels like [Host] or [Guest] at the start of relevant lines
Non-speech audio — add descriptions for meaningful sounds like [music playing] or [applause] if they are not already in the caption track
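
The first two checks in the list above can be partly automated. The Python sketch below scans WebVTT text and flags cues that are shown for under 1 second, have lines over 42 characters, or exceed a reading-speed ceiling. The 20 characters per second ceiling is an assumed value chosen for illustration, not a figure from WCAG or from TokCaption, and the function name review_vtt is likewise hypothetical. Speaker labels and non-speech sounds still need a human listener.

```python
import re

STAMP = r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"  # hours are optional in WebVTT
TIMING = re.compile(rf"{STAMP}\s+-->\s+{STAMP}")

MIN_DURATION = 1.0  # seconds, from the checklist above
MAX_LINE_LEN = 42   # characters, from the checklist above
MAX_CPS = 20.0      # characters per second; an ASSUMED reading-speed ceiling


def _seconds(h, m, s, ms):
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def review_vtt(vtt_text):
    """Return (cue_start, problem) tuples for cues that need a manual look."""
    problems = []
    blocks = re.split(r"\n{2,}", vtt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                break
        else:
            continue  # header, NOTE, or STYLE block: no timing line
        g = match.groups()
        duration = _seconds(*g[4:]) - _seconds(*g[:4])
        text_lines = lines[i + 1:]
        label = line.split("-->")[0].strip()
        if duration < MIN_DURATION:
            problems.append((label, f"on screen for only {duration:.2f}s"))
        for t in text_lines:
            if len(t) > MAX_LINE_LEN:
                problems.append((label, f"line has {len(t)} characters"))
        chars = sum(len(t) for t in text_lines)
        if duration > 0 and chars / duration > MAX_CPS:
            problems.append((label, f"reading speed {chars / duration:.1f} chars/s"))
    return problems
```

Calling review_vtt on the contents of an exported file returns an empty list when nothing is flagged. Treat any flagged cue as a prompt to re-time or re-break the caption by hand, not as a pass/fail verdict.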

Batch captions for multiple videos

If you are embedding an entire series of repurposed TikTok clips on your site, use the bulk transcribe workflow to extract all caption tracks in one session. Export each as an individual VTT file and attach them to their corresponding video players.
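
With many clips it is easy to miss one. As a sanity check before publishing, the Python sketch below lists videos in a folder that have no caption file next to them. It assumes a naming convention that this guide does not prescribe: each video and its VTT file share a base name (for example clip-01.mp4 and clip-01.vtt). The function name find_missing_captions is hypothetical.

```python
from pathlib import Path


def find_missing_captions(folder, video_ext=".mp4"):
    """Return names of videos in `folder` that have no same-named .vtt file."""
    folder = Path(folder)
    missing = []
    for video in sorted(folder.glob(f"*{video_ext}")):
        # Assumed convention: clip.mp4 is captioned by clip.vtt
        if not video.with_suffix(".vtt").exists():
            missing.append(video.name)
    return missing


if __name__ == "__main__":
    for name in find_missing_captions("."):
        print(f"No caption file found for {name}")
```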

VTT vs SRT for accessibility

Use VTT when:

Publishing video on the web with HTML5 players
You need styling options (VTT supports CSS-like cue styling)
Your platform expects WebVTT format (most modern CMS and hosting platforms)

Use SRT when:

Editing in desktop video software before final publish
Your hosting platform specifically requires SRT
You plan to convert to VTT as a final step before deployment

Frequently asked questions

Why do I need separate caption files for accessibility?

When you embed repurposed TikTok video on your website, the original TikTok captions do not carry over. You need a VTT or SRT file attached to your video player so that deaf and hard-of-hearing users can read synchronized captions.

Does TokCaption produce WCAG-compliant captions?

TokCaption exports timestamped VTT and SRT files. The timing and text structure meet the technical requirements for WCAG 1.2.2 (Captions - Prerecorded). You may need to review formatting, line length, and reading speed for full compliance.

What is the difference between captions and subtitles?

Captions include all relevant audio information (speaker identification, sound effects) for deaf and hard-of-hearing viewers. Subtitles typically contain only dialogue. TokCaption exports the caption track as published by the creator, which usually covers spoken dialogue.

Can I add speaker labels to the exported captions?

TokCaption exports the raw caption track data. If the original TikTok captions include speaker identification, it will be in the export. Otherwise, you can add speaker labels manually in a subtitle editor after export.

Ready to try it yourself?

Free account — 5 transcript jobs per day, no credit card required.