Speech recognition software

I started this week’s article three days earlier than normal with the expectation that I would write using speech recognition software.  The idea of actually using the technology I would be writing about seemed apropos and certainly appealed to my geek side.  However, as I’ll explain, the results certainly weren’t what I had expected.

Over the years I’ve been asked to set up computers in preparation for Dragon NaturallySpeaking software installations.  Dragon is considered the premier voice recognition software on the market and is probably the most popular.  Someone like me is supposed to be able to sit down in front of a computer and microphone and simply speak commands and text.

This week at work I was asked to set up another office machine for dictation.  Although I have had a hand in implementing the software, I hadn’t actually used it.  The customer asked me for some information about the actual use of it and I couldn’t comment.  Having been put in my place, I decided that this week I was going to write my article using the built-in Windows 7 Speech Recognition utility.

My first task was picking a computer and microphone combination that would actually work for the software.  It wasn’t as easy a job as I had thought.  It turns out my handheld microphone wasn’t recommended because there needs to be a constant distance between it and my mouth.  After hunting around for the right microphone I settled on my laptop’s built-in unit.

It took me nearly an hour of recording “testing, testing, testing 1, 2, 3” before the recording and playback levels were correct.  The next trick was making sure there was absolutely no background noise.  Zeroing out background noise is a tough task with five kids in a three-bedroom ranch.  Add to that Halloween candy coursing through their veins and this place is crazy.

Finally everything was all set and ready to go.  My settings were correct and I had a quiet time when I could dedicate an hour to ‘train’ the software.  I was presented with an incessant stream of sentences full of similar words like “father” and “further,” presumably so the software could learn to distinguish my accent and intonations.  After spitting out my gumball and with a cup of hot coffee in front of me, I made my way through the text.

Excited that I was never going to have to type another single word, I launched Microsoft Word by saying “start Office.”  Microsoft Office 2007 launched with no problem and I had a little icon at the top of my screen to indicate it was working.  As I spoke the start of the article into the microphone I could see my words shooting across the page.  After about five lines of text one of the kids knocked on the door yelling “Jeromy, Jeromy!”  Across my screen I could see “hear me?” twice.

Of course, deleting text is as easy as backspacing eight or so spaces, so I said the command “backspace,” which moved the cursor back one position and wrote the word “backspace” seven times.  It occurred to me after the second go-around that simply saying the word wasn’t good enough.  I would actually have to use the mouse, highlight the text, and manually delete it.  Yes, I can be kind of dense.

Being able to type as fast as the spoken word is the appeal of speech recognition software.  I’ll bet that most who use it successfully have been trained to talk like newscasters and are probably from the Plains States with little or no discernible accent.  To get the software to recognize my voice and speech pattern correctly I found myself talking somewhat slowly and really paying attention to how words came out of my mouth.  I may not be able to speak clearly enough to be recognized, but I can certainly type more accurately.  I think I’ll stick to typing any day.

(Jeromy Patriquin is the President of Laptop & Computer Repair, Inc., located at 509 Main St. in Gardner.  You can e-mail him at remoquin@gmail.com or call him directly at (978) 919-8059.)