A Review of NaturallySpeaking Version 7.0

Copyright © Alan Cantor 2003. All rights reserved.
This review appeared on-line in May 2003

ScanSoft released the latest version of its speech recognition software a few months ago. NaturallySpeaking 7.0 is the first major upgrade in over a year. Version 6.0 was, in many ways, a step backwards from versions 4.0 and 5.0, which were both excellent. Version 6.0 had so many programming bugs and usability bloopers that I did not recommend it to any of my clients. This upgrade has been long overdue.

I am happy to report that version 7.0 is a huge improvement over version 6.0. It's fast and accurate, and many of the serious problems introduced in the previous version have been addressed.

ScanSoft's promotional materials overstate the improvements in accuracy since versions 5.0 and 6.0. As I recall, ScanSoft boasts a 15% accuracy boost over version 5.0. I am skeptical about this figure. Now as before, good accuracy is achieved by understanding the program thoroughly, by having a beefy PC (e.g., a very fast processor with oodles of RAM; I consider 256 MB to be an absolute minimum for serious users), and by knowing how to prepare the system for dictation. For example, you can feed the program lists of commonly used words and expressions, provide writing samples for analysis, and perhaps most importantly, be meticulous about correcting errors during the first hours of use. It's still important to do all these things. And yet, after having used version 7.0 for a couple of months, I do find it to be slightly yet noticeably more accurate. On the other hand, if you are currently using an earlier version, and are satisfied with the accuracy you have achieved, you may have little incentive to upgrade unless you can benefit from some of the new features.

Support for dictating in and controlling applications such as Microsoft Word, Outlook, Excel, etc. is greatly improved. However, I continue to get the best results by dictating directly into DragonPad, the proprietary text editor that ships with the product. In some programs, there continue to be screens that do not respond to voice commands, period. So 100% hands-free operation has not yet been achieved.

Version 7.0's Web browsing tools are excellent. An ingenious set of voice commands makes it possible to efficiently navigate the Web via Internet Explorer without mouse emulation, i.e., the notorious MouseGrid command, which is an extremely cumbersome way to accomplish most tasks. (There are almost always more efficient ways.)

In version 7.0, you can speak informally, but write formally by toggling off abbreviations and contractions. In other words, you dictate, "I can't and don't understand why it doesn't happen," and the system will write, "I cannot and do not understand why it does not happen." Cute, but of limited value.

Auto-punctuation inserts commas and periods automatically on-the-fly. This feature is not helpful to me, and fortunately, it can be turned off. However, one of my assistive technology colleagues noticed that auto-punctuation worked very well for a young client with a learning disability. Auto-punctuation is definitely "cool," and whether or not it works for an individual, it is exciting because it previews what speech recognition could look like in the future when the software can detect more subtle vocal nuances.

As with any sophisticated software, the new release introduces new bugs. For example, "Switch To Previous Window" and "Switch To Next Window" no longer work properly in maximized windows. These two commands now activate the target window's system menu and "restore" it, which changes the size of the window.

Despite these minor irritations, I believe that 7.0 is the best speech recognition product on the market today. Please note that proper system configuration and training are crucial for successful implementation. Despite claims you may have heard that speech recognition software is easy and intuitive to use, nothing could be further from the truth, especially for serious writers. I find that many people require 20 to 30 hours of individual training, and lots of practice, before they become proficient users. I also find that the people who are most likely to become proficient at the program have strong language skills. I have had greater success implementing speech recognition with journalists than I have with engineers. Having an intuitive grasp of language is, for me, a strong predictor of the likelihood that a serious writer will learn to use the program to advantage.