Webinar Series: Improving Captioning Quality

By: Kevin Erler, Ph.D.

In this webinar we cover the best practices for improving captioning quality for videos that you caption offline using a closed captioning service.

Note: The following video should be considered an alternative to the Annotated Transcript, which contains descriptions of visual references in the media. Also, the pages listed in the Resources section are primarily text-based, and will be useful to those who do not have access to the visual content.

Annotated Video Transcript

Hello. In today’s webinar I will explore what actions you can take to ensure you get the best quality caption results back from your captioning vendor. I realize that by outsourcing your captioning work, you would like to just send your videos out and have them come back captioned perfectly. While we certainly aim to make that happen, there are definitely some jobs we get that are challenging to get right – and there are some actions you can take to improve the accuracy of your captions – which is something we all want.

First, let me introduce myself. My name is Kevin Erler – I am one of the co-founders of AST and have been actively working on captioning technology and services for over 12 years now.

As this entire talk is focused on how to improve the quality of your caption results, I thought I’d better start by defending the premise that quality does matter. In captioning, quality and accuracy don’t just matter, they are critical. First and foremost, you are providing captioning to ensure your content is accessible to those that do not have access to the audio content – those folks are relying on the idea that the captioning is a true and accurate representation of the audio track. If there are errors, then they are not getting fair and equal access to the content. It is also your institution’s reputation that is at stake – just as you are careful to avoid publishing errors in printed material with your seal on it, captioning errors also reflect poorly on the institution. Finally, we are strong proponents of using captioning data for more than accessibility – using it to drive search and indexing features, drive the ability to access supplemental information, drive the ability to access content in other languages … but all of these features are premised on the idea that the captions are accurate in the first place. They won’t be very impressive – or useful – if the captions are wrong.

To get the best results, you need to start with the best ingredients. True for cooking; true for captioning. The obvious starting point is choosing a vendor that uses professional transcribers to generate the transcript, but that isn’t the whole story. Here are some samples of audio that we’ve received in just the past couple days – even professionals will struggle with these.

[ Example of unintelligible audio recording ]

[ Another example of difficult audio ]

I think you have to admit, those are a challenge. So what can you do to make the transcribers’ job manageable? Well, the first thing to know is that audio quality does matter – it is very difficult to get a good transcript from a lousy audio file. A good thing to keep in mind is this: if our transcribers can’t make out what was said, using specialized transcription equipment and high quality headphones, then your viewers won’t fare much better. The number one thing that affects audio quality is proper mic’ing. Mic proximity is much more important than just getting an expensive mic – a $100 wireless lapel mic does a far better job than a $1000 room mic – which is probably picking up room echo, audience noises, paper shuffling, chairs moving, and air conditioner noise. While lapel mics do a great job of capturing a good recording of the speaker and eliminating background noise, be aware that secondary speakers (such as questions from the audience) will not be captured. Plan for this, either by having a portable mic for audience members to use when asking questions, or by having the primary speaker repeat those questions. As much as we think that we speak in complete sentences, most presenters talk in sentence fragments. Many speakers are concerned when they see a transcript of their presentation because the sentences appear incomplete or disorganized … this is why even some of the best speakers use a script or at least fairly complete speaking notes. If you have presenters like this, encourage them to consider writing out what they want to say.

Once you get good audio, you have increased the odds of a good transcript tremendously. But if your content is specialized or uses uncommon terminology, it can still be a challenge to get those terms correct. If the professor has a glossary of terms for the class, by all means please provide it to us. A link to the course webpage could be all that we need. Some folks give us just the audio track to caption – and technically that is all that we need, but for difficult content, submitting the video provides our transcribers with extra information that can help them get a better transcript. Finally, if you have extra content such as notes or other documents that you cannot attach to your submission, you can always email it to us along with the job ID and our support folks will make sure it gets to the right place.

CaptionSync has a number of features to help you help us. I’d like to take a moment here and point some of those features out to you. First, opening up the “Transcriber Guidance” box on the submission page allows you to put in text notes that the transcriber can use. Terminology, a link to the course webpage, or special instructions can go here [In the Transcriber Guidance field of the Captioning Submissions page].

CaptionSync also allows you to add a “persistent note” to the transcriber – this is a note that will automatically get attached to all of your submissions, so you don’t have to keep making the same notes if you have standing instructions. You can access Persistent Notes from the Settings tab on your account.

CaptionSync also allows you to narrow the transcription pool if you know your content needs specific expertise. This is particularly useful for content that is technical in nature and requires special training for the transcriber such as medical content, or economics, chemistry or math material. We suggest you use this feature sparingly – when you reduce the pool of potential transcribers, your content may queue up waiting for an appropriate transcriber, so it may take longer than the advertised turnaround time. All of our transcribers are professional transcribers, so selecting a specialized expertise will not get you a better transcriber – just one that has experience in the subject area you select.

If your content is very specialized, you may want to review the transcript before we generate the caption files. Selecting this option allows you to check over the transcript or have a subject-matter expert check over the transcript, and make any adjustments that you want before we create the captions.

Finally, CaptionSync also allows you to choose which spelling standards you want us to use. This feature is mostly for the benefit of our international clients, but there may be specific cases where you want to choose Canadian, UK, or Australian spelling conventions.

Inevitably, some errors will get through. So what do you do when you discover a problem after the caption files are all done? If the changes are minor and just involve changing some words in the transcript, we recommend you use the “redo” feature. This allows you to just edit the transcript, make your changes, and immediately regenerate the caption files. It takes just a couple of minutes and does not cost any extra to do. Redos can be done as many times as you need for 6 months from the original submission date. The Online Caption Editor is a slightly more sophisticated way to accomplish the same thing – it allows you to view the captions in a live player and alter them as you see them. Both the redo and editor features can be accessed by going to the Status section of your account and selecting the job you want to modify.

If the problems are more extensive and you would like us to review or repair them, then click on the Help link from your account and go to our Support Center. Here you can open a ticket – give us the job ID and explain the problem and we’ll take it from there.

The biggest challenge to getting high quality captions is getting the transcript right. If you already have the transcript – because you are working from a script or because you have a subject-matter expert producing the transcripts, then please feel free to use them. Make sure you follow our very simple transcript format guidelines – we’ll give you a link at the end of this presentation. If your audio is poor or if you do not want to spend the time getting your caption files to come out perfect, use our “Result Review” service which will send your transcript over to a human reviewer who will mark it up properly to ensure it works well with our system. Note that you can add Result Review after-the-fact, so if you are not sure if you need it, try without and add it after if you can’t get it all to work the way you want it to.

Ok, I have just a few closing thoughts on this topic that I would like to leave you with before we open up for questions. We realize that you cannot always do anything about the quality of the audio – sometimes you get handed a recording that is complete and there is no opportunity to re-record it, but you still need to caption it. Because we hold our transcribers to quality metrics, if they feel they are not able to generate a good quality transcript for you, they will likely reject your job. Our transcription manager will try hard to find a suitable transcriber for your job, but if too many of them are unable to generate a transcript, the job will eventually get rejected. There is no cost for a rejected job … but a rejected job also still leaves you with no captions. If you really need to get a caption file and you know that you have a poor audio file, then put something along the lines of “use best effort to get a transcript” in the Guidance field. When we see this, we will not use the job for quality metrics and our transcribers will just do the best job they can with your file. If they really can’t make out what is being said, it may still get rejected, but if they can make out most of it, they will give it a try. Finally, note that the Turnaround Time that you select also has an impact on rejections. It takes a fair bit of time to produce an accurate transcript, and even more time if the audio is difficult. If you are submitting poor audio and asking for short turnaround times, it increases the likelihood of a rejection; consider using a slower turnaround for poor audio files as it will give the transcriber more time to work on it.

Finally, here are some links to resources that expand on what I covered today; you will also find these links on the webpage for this webinar.