Comparing Vendors? Don’t Be Misled


For most of our 15-year history we have felt that the community of accessibility practitioners that we’re a part of was relatively immune to the misleading self-promotion practices that have become so pervasive in recent years. Accessibility is the pursuit of making the world more inclusive; false information and misleading advertising just don’t belong here.

Knowing all the pros and cons of a particular service before selecting a vendor is important, but you also need to weigh factors such as ethics, honesty, and transparency when making a purchasing decision. We created this page to fact-check several false and misleading claims that have been recently made about our services.

CaptionSync™ vs. 3Play Media

In particular, we need to correct the record about several recent claims made by 3Play Media about our CaptionSync™ service:

Accuracy: 3Play claims that the accuracy rate of our service is between 93.5% and 95%. This is a complete misrepresentation of the facts. It appears that 3Play is citing an average performance metric for their own accuracy, and comparing it to a small, specific, incognito “test” that they ran on our service. Their test videos consisted of seven highly accented speakers with poor intelligibility, and the submissions did not make use of any of our guidance on how to get the best results. In other words, the test was designed to get the lowest possible accuracy rate for AST, and then compare it to a generic average rate for 3Play. This is not an apples-to-apples comparison, and is therefore completely misleading.

We were able to locate the test submissions they made, and we measured accuracy on them to check their statements. On several of the videos, we measured accuracy to be above 99%; our accuracy slipped slightly (to between 97% and 98%) on a couple of the videos that had particularly poor intelligibility, and one video was simply too unintelligible to yield a meaningful accuracy measurement. One can “engineer” an assessment to come up with much lower results, but by commonly accepted standards, our results were quite good, despite 3Play’s attempts to create a poor outcome.
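For readers who want to check accuracy figures for themselves, one commonly used approach is to compare a vendor's transcript against a trusted reference transcript and count word-level errors (substitutions, insertions, and deletions). The sketch below is a minimal illustration of that idea, not a description of the method used by either vendor; the function names and the text normalization (lowercasing, stripping punctuation) are assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase and split into words, dropping punctuation (an assumed normalization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_edit_distance(ref_words, hyp_words):
    """Minimum number of word substitutions, insertions, and deletions
    needed to turn ref_words into hyp_words (Levenshtein distance on words)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Return 1 minus the word error rate, measured against the reference."""
    ref_words = tokenize(reference)
    hyp_words = tokenize(hypothesis)
    if not ref_words:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    reference = "The quick brown fox jumps over the lazy dog."
    hypothesis = "The quick brown fox jumped over the lazy dog."
    print(f"{word_accuracy(reference, hypothesis):.1%}")
```

Results from a measurement like this depend heavily on the audio chosen, on the normalization rules, and on how unintelligible passages are scored, which is why figures taken from different test sets are not directly comparable.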

Here is an example of one of their “tests.” Listen to this audio, and write down what you hear, if any of it is intelligible:

Just like 3Play, our average accuracy with good audio is over 99%. We use professional, trained transcribers to achieve this, and we benchmark transcriber performance frequently.

Completeness of Transcripts: 3Play claims that we provide incomplete transcripts. This is misleading and inaccurate. The claim may stem in part from a common misunderstanding about the difference between the way humans understand speech and the way machines interpret speech. Systems that use machine-generated transcripts (or hybrid systems that combine speech recognition with human reviewers) are designed so that they must essentially take a “guess” at every single word in an audio or video file. This leads to transcripts that are “complete” only in the sense that every word and phrase is accounted for, but they are not necessarily accurate or useful for human users of the transcripts or captions.

In contrast, as humans, we work to extract meaning from speech, and we do not force ourselves to understand every single word or phrase in the process. If part of a dialog is unintelligible, we mark it as such and move on to try to extract meaning from the other parts of the dialog that are intelligible.

AST’s transcription team consists entirely of highly trained human transcribers. As a result, there may occasionally be words or phrases in a file that our transcribers will mark as “[ inaudible ].” They are trained to do this because it produces captions that are more useful and intelligible to human consumers. In short, our transcripts that contain segments marked as inaudible are not incomplete; on the contrary, these markings indicate that our transcribers are doing the job they set out to do: to accurately convey what a hearing user would be able to understand, and not to make guesses that could confuse users who are relying on the captions.

Quality Assurance: 3Play claims that our service lacks a quality assurance process, and that clients must pay $1.50 per minute extra if they want their transcript reviewed. This is incorrect on several levels. First, unlike many newer entrants to the captioning field, CaptionSync does not start from a machine-generated or crowd-sourced transcript. As a result, our team members do not need to waste time on the laborious task of correcting errors created by machine algorithms and amateur transcribers or editors. Thus, an important part of our quality assurance process is to use high-quality inputs at every step in the process.

We do offer review services, but their purpose is not to correct the transcription errors of machines or amateur transcribers. Instead, these services are designed to assist users in perfecting the timing of captions when their video contains music, sound effects, or other “sweetening,” and to assist clients providing their own transcripts. Furthermore, these review services start at less than half of the price claimed by 3Play.

Support: 3Play claims that our support team is “unresponsive,” and touts their support policy of responding to all tickets within three business hours. AST employs a formal support infrastructure (Zendesk) to track all support requests and our responses; it provides detailed metrics on support performance, which we track carefully. As reported by our Zendesk system, our median first response time for support tickets is 1.2 business hours, and our customer support satisfaction rating is over 97%, much better than industry averages, which hover in the 60% to 70% range.

Users Vulnerable to Legal Action: 3Play claims that our captions fail to meet industry accuracy standards, leaving users vulnerable to legal action. AST has been in business for over 15 years, serving thousands of customers, and none of our clients has ever been subject to legal action related to the quality of our transcriptions or captions. The same cannot be said of organizations that have attempted to replace professional human transcribers with machine-generated transcription, such as Harvard, MIT, and the University of California at Berkeley.

Audio Description: We commend 3Play on their decision to provide audio description services, an important emerging area in the field of video accessibility. However, we again need to correct them on specific details. They claim that CaptionSync does not provide audio files with our audio description services. We do provide audio files, along with full-text alternative descriptive transcripts and WebVTT files with text descriptions (for implementation of WCAG 2.0 Technique H96), all at no extra charge.

Industry Expertise: 3Play touts their industry expertise by pointing to “hundreds of free resources” on their website and blog, contrasting this with a bland statement claiming that “CaptionSync posts to their blog infrequently.” While we appreciate 3Play’s youthful exuberance and prolific use of social media, we would respectfully point out that quantity does not equate to quality, nor is it a reasonable proxy for “expertise.” Our position is that thoughtful, technically accurate documentation, combined with a collaborative approach to working with clients, is a much better indicator of industry expertise than the number of Twitter followers or Facebook likes that one garners. AST has also written hundreds of technical support articles and conducted dozens of collaborative webinars with clients, and we work hard to share our expertise with customers every day. Our team members have presented frequently at conferences, and have served as technical experts for groups needing captioning expertise.

Word-for-Word Synchronization: 3Play claims that CaptionSync does not offer word-for-word synchronization. Word-for-word synchronization has been one of our offerings for over 15 years, starting well before 3Play was even in business. This feature of our platform has been well documented for many years, and could have been verified with simple fact-checking.

We could go on, but you probably get the point by now. Unfortunately, it is not just one vendor that is making false or exaggerated claims; other new entrants to the transcription and captioning arena have also entered the fray. We’re all for tooting your own horn and calling out your strengths, but we also believe that maintaining honesty and integrity while doing so is equally important. We sincerely hope, for the sake of the communities that we serve, that our competitors will rejoin us in taking the high road as we promote video accessibility services. There are millions of people around the world who can benefit from greater access to high-quality educational video, and we should all work together to make this goal a reality.