NaturallySpeaking 5.0: Is the Upgrade Worthwhile?
Copyright © Alan Cantor 2001. All rights reserved. This article was published in Real Times. May - July 2001, No. 49, pp. 6-7. Center for Accessible Technology, Berkeley, CA.
I would like to share some of my impressions after having upgraded from version 4.00 to 5.00 of NaturallySpeaking.
Version 5.00 is the first NaturallySpeaking upgrade since L+H purchased Dragon Systems last year. Version 5.00 is a significant upgrade, and apparently represents an attempt to merge the best of VoiceXpress with the best of NaturallySpeaking. NaturallySpeaking was, in my opinion, a much better product. VoiceXpress, for example, has few editing commands, whereas NatSpeak has many ingenious commands for navigating around and revising documents. Although Version 5.00 has much in common with versions 4.00, 3.52, and 3.00, it also seems to have inherited some of the design philosophies that made VoiceXpress problematic. So while there are notable enhancements, there are also features and operating characteristics that may render the program less usable, both to people with and without disabilities.
Overall Assessment
Dragon Systems NaturallySpeaking version 5.00 is a mixed bag. Consider carefully before you upgrade, and decide whether the enhancements are worth the functionality you will sacrifice.
Accuracy
I detect no significant improvement in accuracy between the two versions. If you have achieved good accuracy using version 4.00, you may well be satisfied by what you already have. A warning: When upgrading to version 5.00, I recommend that you do not bother allowing the program to modify your old voice files for use in the new version. When I allowed NatSpeak to modify my existing voice files, accuracy was much worse than when I re-created voice files from scratch. (There are two utilities, "GetWords" and "PutWords," on Joel Gould's unofficial NaturallySpeaking web site that make it easy to transfer vocabulary and pronunciation from one voice file to another.)
Correction Dialogue
I am extremely disappointed by changes in the Correction window. In version 5.00, you can choose between two different Correction windows. (I have not yet worked with the new "Quick Correct" technique because I initially found it too restrictive.) Unfortunately, the fonts in both types of correction windows are hard-coded in a tiny, spidery, 8-point typeface, and the information in these windows is difficult to see.
In version 4.00, it was possible to adjust the Display Properties applet in the Windows Control Panel to increase the font size in the correction window. (I believe that the "Message Box" item affected the appearance of the NaturallySpeaking correction dialogue.) Hard-coding the fonts in version 5.00 is a major usability/accessibility blooper. Despite the fact that I have "perfect" corrected vision, I am forced to crane my head forwards to read the correction dialogue, even with a 17-inch monitor. On my laptop, the correction window is even less legible.
Furthermore, "Automatic Playback on Correction" does not work as well as in version 4.00. More often than not, playback reads an extra word at the end of the phrase or word you are correcting. Thus, visual and audible feedback do not correspond, which is confusing and can lead the user to make mistakes during correction. In the new version, the user must wait until automatic playback is complete before making corrections by voice; in version 4.00, the user did not have to wait and could issue commands at any time. Consequently, correcting errors with automatic playback switched on can be a source of error, and I have had to disable the automatic playback feature. In addition, after making a choice from the Correction window, the new version is more likely to ignore the rules of capitalization than the earlier version.
Mouse-free operation
There are a number of new usability problems related to the keyboard-only interface. Unlike version 4.00, version 5.00 requires a mouse to perform several steps during enrollment. In fact, the entire program is somewhat more mouse-intensive than in the past, and the keyboard interface is less intuitive and more awkward to use than in version 4.00. The awkwardness of the keyboard-only interface can translate into ungainly voice commands. For example, in certain situations you must say something like "Press Keypad Star" to put focus on the "DragonBar," which is a floating toolbar that contains two NatSpeak menus and several (unlabelled!) toolbar buttons.
Status line information
In version 4.00, the status line in NatSpeak's proprietary editor provided useful information about recent utterances. Some of this information is no longer available. The only way to get any of it is to display the Results Box at all times. However, the Results Box is visually distracting, especially for users with certain learning disabilities. I have had to turn it off for many of the people whom I have trained; personally, I have never seen a need for it. In version 5.00, hiding the Results Box means that users cannot obtain as much information about the state of the program. Even with the Results Box showing, the program no longer reports events such as when a "Select" command fails to find text.
Compatibility
This is the good news. Version 5.00 works more seamlessly in more applications than version 4.00. I have found that I can reliably dictate into Microsoft Word and Outlook without significant loss of accuracy, speed, or features. Note, however, that version 5.00 appears to be at least as crash-prone as its predecessor when dictating into applications other than the Dragon Systems proprietary editor.
Activating menus, and dialogue/toolbar buttons
More good news. It is no longer necessary to say "Click" to activate menus and dialogue buttons. There are situations, however, when not uttering "Click" can give unexpected results. For example, when you try to dictate the word "Insert" into a document, the program will pull down the "Insert" menu! But this is a minor inconvenience, and one of the annoyances that one must accept when operating a PC by voice.
New features
There are noteworthy new features: It is now possible to create multi-line voice macros in the Preferred Edition. (In the past, you were limited to one-line voice macros.) Canadian and British users will appreciate the ability to insert properly formatted postal codes directly by speaking.
Summary
If you are using version 4.00 and are happy with it, consider carefully before upgrading. For my part, I will continue experimenting with the new version, but I don't think that I will give up on version 4.00 yet, especially on my laptop. The best thing about version 5.00 is that it is compatible with more programs than earlier versions. The worst thing about version 5.00 is that correcting misrecognitions is far more difficult than in the past. I consider the ease with which one can fix misrecognitions a rough measure of the overall usability of a speech input product. The easier it is to correct misrecognized utterances, the more likely it is that a user will be able to tune the voice files for accurate and speedy dictation.
Test systems used: Dell PIII 550 desktop with 256 MB RAM, and a Dell PIII 800 laptop with 192 MB RAM.