A Method for Differentiating Homophonic First Names when Using Speech Recognition Technology
Presented at the SIG 11 (Computer Applications) Show-and-Tell at RESNA 2009.
Executive summary
Despite the high accuracy of speech recognition technology, homophonic first names (names that are pronounced the same but spelled differently) will always be subject to error. During free-form dictation, no amount of context or data will enable the software to reliably determine whether a user means "Katherine," "Kathryn," or "Catherine"; or "Bobby," "Bobbi," or "Bobbie."
To address this difficulty, I experimented with a mnemonic (memory aid) to make homophonic names acoustically distinct. The pronunciation for each name consists of:
- The name
- "with" OR "without"
- "a" OR "an" OR "one" OR "two"
- The name of a letter of the alphabet, singular or plural, or a spelled-out equivalent (for example, "Ells" or "Kay")
Examples (written name = spoken form):
Ann = Anne without an E
Dillon = Dylan with two Ells
Greg = Greg with one Gee
Kathy = Cathy with a Kay
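The four-part pattern and the examples above can be sketched in code. The Python function below is an illustration only, not part of NaturallySpeaking or of the author's list: the name `spoken_form` and its parameters are hypothetical, and the plural is formed by simply appending "s" to the letter name.

```python
ALLOWED_QUANTIFIERS = {"a", "an", "one", "two"}

def spoken_form(base_name, with_letter, quantifier, letter_name):
    """Build a mnemonic spoken form such as 'Dylan with two Ells'.

    base_name   -- the name as it is pronounced in the mnemonic (e.g. "Dylan")
    with_letter -- True for "with", False for "without"
    quantifier  -- one of "a", "an", "one", "two"
    letter_name -- the letter as spoken, singular (e.g. "E", "Ell", "Kay")
    """
    if quantifier not in ALLOWED_QUANTIFIERS:
        raise ValueError(f"unsupported quantifier: {quantifier!r}")
    connective = "with" if with_letter else "without"
    # Only "two" calls for a plural letter name in this pattern.
    if quantifier == "two":
        letter_name += "s"
    return f"{base_name} {connective} {quantifier} {letter_name}"

# Written name -> spoken form, reproducing the four examples above.
pairs = {
    "Ann": spoken_form("Anne", False, "an", "E"),
    "Dillon": spoken_form("Dylan", True, "two", "Ell"),
    "Greg": spoken_form("Greg", True, "one", "Gee"),
    "Kathy": spoken_form("Cathy", True, "a", "Kay"),
}
```

The quantifier is passed in explicitly rather than inferred, because the choice between "a", "an", and "one" depends on how the letter name sounds and on the speaker's preference.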
I have compiled a list of name-pronunciation pairs that can be imported as NaturallySpeaking custom words. The names require no training; the list is easily expanded and customized; and other mnemonics can be added or substituted. A basic name-pronunciation list is available from the author on request.
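As a rough sketch of how such a list might be prepared for import, the Python snippet below formats one name-pronunciation pair per line. The `written\spoken` layout with a backslash separator is an assumption about the import format and should be checked against the documentation for the NaturallySpeaking version in use. The function name `format_word_list` and the file name in the usage note are hypothetical.

```python
def format_word_list(pairs, separator="\\"):
    """Return the text of a word list, one 'written<sep>spoken' entry per line.

    pairs     -- iterable of (written_name, spoken_form) tuples
    separator -- assumed delimiter between written and spoken form
    """
    lines = [f"{written}{separator}{spoken}" for written, spoken in pairs]
    return "\n".join(lines) + "\n"

# Sample pairs taken from the examples in this article.
sample_pairs = [
    ("Ann", "Anne without an E"),
    ("Dillon", "Dylan with two Ells"),
]
word_list_text = format_word_list(sample_pairs)
```

To save the result, the text could be written out with something like `open("names.txt", "w", encoding="utf-8").write(word_list_text)`. The encoding the import dialog expects is not stated here and may differ.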
Note: With the release of Version 11 in 2011, Nuance changed how NaturallySpeaking handled custom words: spoken forms could no longer contain punctuation marks or symbols. I updated this article accordingly:
Before Version 11: Ann = "Anne without an E."
Version 11 and after: Ann = "Anne without an E"
In other words, from Version 11 onward, a spoken form can no longer include the period after the letter.
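For a list created before Version 11, the punctuation could be removed from every spoken form in bulk. A minimal sketch in Python, assuming that dropping all ASCII punctuation is acceptable; this also removes apostrophes and hyphens, which may not suit every list. The helper name `strip_punctuation` is hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation mark and symbol.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)

def strip_punctuation(spoken):
    """Remove punctuation from a spoken form and tidy the spacing."""
    cleaned = spoken.translate(_PUNCTUATION_TABLE)
    # Collapse any whitespace runs left behind by the removed characters.
    return " ".join(cleaned.split())

old_form = "Anne without an E."
new_form = strip_punctuation(old_form)
```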