A Method for Differentiating Homophonic First Names when Using Speech Recognition Technology

Copyright © Alan Cantor 2009. All rights reserved.
Presented at the SIG 11 (Computer Applications) Show-and-Tell at RESNA 2009.

Executive summary

Despite the high accuracy of speech recognition technology, homophonic first names (names that are pronounced the same but spelled differently) will always be subject to error. During free-form dictation, no amount of context or data will enable the software to reliably determine whether a user means "Katherine," "Kathryn," or "Catherine," or whether the user means "Bobby," "Bobbi," or "Bobbie."

To address this difficulty, I experimented with a mnemonic (memory aid) to make homophonic names acoustically distinct. The pronunciation for each name consists of four parts, spoken in this order:

  1. The name
  2. "with" OR "without"
  3. "a" OR "an" OR "one" OR "two"
  4. A letter of the alphabet, singular or plural, or its spelled-out equivalent (for example, "Kay" or "Ells")

Examples (written name = pronunciation):

Ann = Anne without an E
Dillon = Dylan with two Ells
Greg = Greg with one Gee
Kathy = Cathy with a Kay
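
The four-part pattern can be sketched in a few lines of Python. This is only an illustration of how the pronunciations are assembled, not part of the original method or of NaturallySpeaking; the function name `spoken_form` and its parameters are hypothetical.

```python
# Hypothetical helper: builds a pronunciation from the four parts listed above.
ALLOWED_QUANTIFIERS = ("a", "an", "one", "two")


def spoken_form(name: str, has_letter: bool, quantifier: str, letter: str) -> str:
    """Join name, "with"/"without", a quantifier, and a letter into one phrase."""
    if quantifier not in ALLOWED_QUANTIFIERS:
        raise ValueError(f"quantifier must be one of {ALLOWED_QUANTIFIERS}")
    connective = "with" if has_letter else "without"
    return f"{name} {connective} {quantifier} {letter}"


# Written name -> pronunciation, reproducing the four examples above.
examples = {
    "Ann": spoken_form("Anne", False, "an", "E"),
    "Dillon": spoken_form("Dylan", True, "two", "Ells"),
    "Greg": spoken_form("Greg", True, "one", "Gee"),
    "Kathy": spoken_form("Cathy", True, "a", "Kay"),
}

for written, spoken in examples.items():
    print(f"{written} = {spoken}")
```

The name used inside the pronunciation need not match the written spelling, because only its sound matters: "Ann" is spoken as "Anne without an E."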

I have compiled a list of name-pronunciation pairs that can be imported as NaturallySpeaking custom words. The names require no training; the list is easily expanded and customized; and other mnemonics can be added or substituted. A basic name-pronunciation list is available from the author on request.
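
As a rough sketch, a name-pronunciation list could be turned into import-ready text as follows. It assumes the import file holds one entry per line, with the written form and the spoken form separated by a backslash; that layout is an assumption here and should be checked against the documentation for your version of NaturallySpeaking. The function name `format_word_list` is hypothetical.

```python
def format_word_list(pairs: dict) -> str:
    """Return one "written\\spoken" entry per line (assumed import layout)."""
    lines = [f"{written}\\{spoken}" for written, spoken in pairs.items()]
    return "\n".join(lines) + "\n"


# A small sample list; extend or customize it as needed.
name_pairs = {
    "Ann": "Anne without an E",
    "Kathy": "Cathy with a Kay",
}

word_list_text = format_word_list(name_pairs)
print(word_list_text, end="")
```

Saving the resulting text to a plain-text file gives a list that can be expanded simply by adding entries to `name_pairs`.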


Note: With the release of Version 11 in 2011, Nuance changed how NaturallySpeaking handled custom words: spoken forms could no longer contain punctuation marks or symbols. I updated this article accordingly:

Before Version 11: Ann = "Anne without an E."

Version 11 and after: Ann = "Anne without an E"

In other words, a spoken form cannot include a period after the letter.
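
A list written for an earlier version can be cleaned up mechanically. The following is a minimal sketch, assuming that every character other than a letter, digit, or space should be dropped; whether Version 11 tolerates apostrophes or hyphens in spoken forms is not covered here, so this sketch removes them as well. The function name `strip_punctuation` is hypothetical.

```python
def strip_punctuation(spoken: str) -> str:
    """Drop punctuation and symbols from a spoken form, then tidy the spacing."""
    kept = "".join(ch for ch in spoken if ch.isalnum() or ch.isspace())
    # Collapse any runs of whitespace left behind by removed characters.
    return " ".join(kept.split())


old_form = "Anne without an E."
new_form = strip_punctuation(old_form)
print(new_form)
```

Spoken forms that already contain no punctuation pass through unchanged, so the function can be applied to a whole list without checking each entry first.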