Special Applications of Automatic Speech Recognition (ASR) with Deaf and Hard-of-Hearing People: Part II
by Ross Stuckless

1997 Symposium on ASR
I once heard a deaf person say wistfully that he longed for a "little black box" he could carry around in his pocket to enable him to become more independent in his communication with hearing people. To date, widespread use of automatic speech recognition (ASR) in classrooms and other group settings in which deaf or hard-of-hearing people are participants, as discussed in Part I of this article (NTID Research Bulletin, 2(3), 1997), has been constrained by the need for a third-party operator. However, in 1997 we came a step closer to that "little black box" when two new ASR products came on the market, both capable of processing large-vocabulary, speaker-adaptive, continuous speech.

Complexity of recognizing natural language
The first commercial applications of ASR were directed toward a more controlled form of language and not toward the more free-flowing form we commonly use in spoken conversation. Naturally spoken language is much more difficult to recognize automatically than the formal, carefully organized language we use in dictating or reading aloud from text, and its error rate is considerably higher as a result. In his presentation, Michael Picheny (1997) talked about dysfluencies common to spontaneous speech: the "um's," "ah's," and "you know's," the false starts and the restarts, all of which complicate the task of speech recognition.

Early generation continuous speech recognition
Mark Mandel (1997) of Dragon Systems extemporaneously demonstrated the pre-release alpha version of his company's first-generation continuous speech product, NaturallySpeaking, Personal Edition. The resulting transcript appears in the sidebar on page 7 without correction (errors are underlined). The speaker added punctuation by voice as he spoke. Four errors appeared in this 129-word continuous speech production, making it better than 95% error-free (about 97% by word count) and quite readable. A second version, NaturallySpeaking Deluxe, has since come on the market with several refinements.

What's next in speech recognition?
Michael Picheny (1997), an expert in ASR research, suggested that "the biggest research challenge over the next couple of years will be to come up with models for handling rapid conversational speech." He also talked about other priorities, including the need to deal with recognition problems posed by dysfluencies and accents in speech, background noise, and telephone characteristics, e.g., narrow bandwidth.

Personal thoughts from the symposium
Like others who participated in the symposium or have read its proceedings, I came away with new information and expectations. First, we need to proactively encourage major ASR developers to recognize deaf and hard-of-hearing people as members of a potential niche market. We should also take the initiative ourselves in adapting new systems and devices as needed.

Conclusion
If I had a disappointment, it was that we had no time to delve more deeply into some topics and to open others. As a former teacher of deaf children, I wanted discussion about ASR's potential for English language learning, both at home and in school. As an educational researcher, I wanted discussion about how ASR might assist deaf and hard-of-hearing students in mainstreamed classes. And as a faculty member in a career-oriented college and university, I wanted to explore some thoughts about how ASR could be adapted to communication needs in the workplace.

Another time...

References
Allen, J. (1997). Applications of automatic speech recognition to natural language and conversational speech. In R. Stuckless (Ed.), Frank W. Lovejoy Symposium on Applications of Automatic Speech Recognition with Deaf and Hard-of-Hearing People (pp. 33-39). Rochester, NY: Rochester Institute of Technology.

To obtain a copy of the proceedings from the Frank W. Lovejoy Symposium on Applications of Automatic Speech Recognition with Deaf and Hard-of-Hearing People, contact Ross Stuckless at ERSNVD@RIT.EDU and type "ASR Proceedings" on the subject line. You can also review and download the proceedings in their entirety at http://www.rit.edu/~ewcncp/Lovejoy.html