Wednesday, 16 November 2016

YouTube & Dragon NaturallySpeaking Transcription Trial

In my last post, I wrote about my trial of VoiceBase, an online transcription AI.

In that post I said that I was unsure whether VoiceBase performed any better than YouTube or Dragon NaturallySpeaking.

So I decided to do a YouTube trial using the same file that I uploaded to VoiceBase. For this, I had to convert my mp3 file to an mp4 file, then upload it to my YouTube account. I did so, and YouTube's AI duly created a text transcription of my file (see image).

For Dragon NaturallySpeaking, however, I had to restate what I had recorded. Although I listened to the recording a few times to try to make my phrasing as similar as I could, it is possible that this test is not exactly the same as the other two.

Below are the results, with the incorrectly recorded words in red (my original words can be viewed here):

What Dragon thought I said


Hello there
 
this is samyoung and I'm recording my voice in order to do a trial on voice space.


I'm just going to do a couple of minutes exploring how well my speech is transcribed using this software


I'm going to do a couple of test quotes in order to see how well this comes through

this is a quote from Reginald Hill from his book arms and the women  1999 page 67

is this really so important to me I've got to say it is this potentially so interesting to readers, they'll have to read it


now a couple of pieces of not prose  but technical writing why can't I get help from this program [cut off point]

What YouTube thought I said


Hello the


this is same Yang and i am recording my voice in order to do a trial on voice space.


I'm just going to do a couple of minutes exploring how well my speech is transcribed using this software.


i'm going to do a couple of teeth quotes in order to see how well this comes through

this is a quote from regional hill from his book [...] and the woman 1999 page 67


is this really so important to me I've got to say it is this potentially so interesting to readers they'll have to read it
now a couple of pieces of technical writing why can't I get help from this program [cut off point]


Dragon NaturallySpeaking was the clear leader, with three errors, eight lost capitals and no 'ear' for punctuation. YouTube's AI came second, with eight errors (one being a lost word), eight lost capitals, and no punctuation.

VoiceBase had 29 errors (two being added words and three being lost words) and three lost capitals, but a better punctuation 'ear', generally hearing where full stops and capitals for new sentences should be included.

The really interesting thing is how many errors there were, given that my test was only a minute long. Not counting capitals or other punctuation, VoiceBase made one error every 2 seconds, Dragon one every 20 seconds, and YouTube one every 7.5 seconds. If we add capitals, the error rates are: VoiceBase one every 1.88 seconds; Dragon one every 5.45 seconds; and YouTube one every 3.75 seconds. If we add errors for missing full stop punctuation only, the error rates are: VoiceBase (+2) one every 1.76 seconds; Dragon (+6) one every 4 seconds; and YouTube (+7) one every 3.33 seconds.
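For anyone who wants to check the arithmetic, here is a rough Python sketch of how the first two sets of rates can be worked out. It assumes the recording was exactly 60 seconds long (I only said "a minute"), and the names used (`RECORDING_SECONDS`, `tallies`, `seconds_per_error`) are my own labels for illustration. It covers the word-error and lost-capital counts only; VoiceBase's word-only figure comes out at about 2.07, which I rounded to 2 above.

```python
# Assumption: the test recording lasted exactly 60 seconds ("only a minute long").
RECORDING_SECONDS = 60

# Word errors and lost capitals for each tool, as tallied in the post.
tallies = {
    "VoiceBase": {"word_errors": 29, "lost_capitals": 3},
    "Dragon": {"word_errors": 3, "lost_capitals": 8},
    "YouTube": {"word_errors": 8, "lost_capitals": 8},
}


def seconds_per_error(error_count, duration=RECORDING_SECONDS):
    """Return the average number of seconds between errors."""
    return duration / error_count


for name, tally in tallies.items():
    words_only = seconds_per_error(tally["word_errors"])
    with_capitals = seconds_per_error(
        tally["word_errors"] + tally["lost_capitals"]
    )
    print(
        f"{name}: one error every {words_only:.2f} s (words only), "
        f"one every {with_capitals:.2f} s (words plus capitals)"
    )
```

A bigger number is better here, since it means a longer gap between errors.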

The winner - after 'training' by the user - is Dragon NaturallySpeaking. However, this software generally costs around USD$100.

The best free option for transcription is YouTube.


Sam
