Spoken and Multimodal Dialog Technology and Systems
8:30 - 11:45 AM, 18 May, 2002

Presenters: Mazin Rahim and Alex Acero

Abstract:
Voice technologies provide opportunities for a new user interface to information access devices where a keyboard and/or screen may not be available. Whether using a cellular phone or a hand-held device, consumers will one day be able to access services and retrieve any information at any time and anywhere through voice, gesture or a combination of the two modalities. The goals of this tutorial are the following:
The tutorial will include demonstrations and videos drawn from business and consumer scenarios to illustrate the process of building spoken dialogue and multimodal applications using speech recognition, natural language dialogue and text-to-speech synthesis technologies.

Agenda:
About the presenters:
Mazin Rahim received the B.Eng. and Ph.D. degrees from the University of Liverpool, England, in 1987 and 1991, respectively. He joined AT&T Bell Labs in 1990 as a consultant in the area of articulatory speech synthesis. In 1991, he was appointed a research professor at Rutgers University, NJ, where he was engaged in research in the area of neural networks for speech and speaker recognition. He joined Bell Labs in 1993 as a technical staff member pursuing research in the areas of robustness, acoustic modeling and utterance verification for automatic speech recognition. Dr. Rahim is currently a division manager in the Speech Processing Center at AT&T Labs-Research. The major focus of his division is the advancement of AT&T's technologies in the areas of interactive speech and multimodal user interfaces. This includes fundamental, forward-looking research in robustness, acoustic and language modeling, and multimodal and spoken language dialog. Dr. Rahim has over fifty publications in the areas of speech and dialog and is the author of the book "Artificial Neural Networks for Speech Analysis/Synthesis" (London: Chapman and Hall, 1994). He holds 10 US patents and is a recipient of several national and international awards. Dr. Rahim is a senior member of the Institute of Electrical and Electronics Engineers (IEEE). He was an associate editor for the IEEE Transactions on Speech and Audio Processing from 1995 to 1999, and a chair of the 1999 workshop on Automatic Speech Recognition and Understanding, ASRU'99. He is currently a member of the IEEE Speech Technical Committee.
Alex Acero received a Master's degree from the Polytechnic University of Madrid (Spain) in 1985, a Master's degree from Rice University (Houston, TX) in 1987 and a PhD from Carnegie Mellon (Pittsburgh, PA), all in Electrical Engineering. He joined Apple Computer in 1990, where he worked on the PlainTalk speech recognition system for the Macintosh. In 1991, he joined Telefonica R&D labs, where he was the manager of the speech technology group, working on speech recognition, synthesis and telephony integration for interactive voice response systems. In 1994 he joined Microsoft Research, where he is currently a senior researcher and manager of the speech technology group. Dr. Acero is an affiliate professor at the University of Washington. The speech technology group at Microsoft Research has contributed speech technology (both recognition and synthesis) to several Microsoft products including Office XP, Windows XP and the SAPI/SDK programming environment. The major focus of this group is to make speech an important modality of an application’s user interface; to that end, it has developed MiPad, one of the first speech-centric multimodal applications for handheld devices. Current research includes robustness, acoustic and language modeling, multimodal technologies and spoken language dialog. Dr. Acero is the author of the books "Spoken Language Processing" (Prentice Hall, 2001) and "Acoustical and Environmental Robustness in Automatic Speech Recognition" (Kluwer, 1993), as well as edited chapters in 3 other books. He has over 50 publications in the areas of speech and dialog and holds 7 US patents. Dr. Acero is a senior member of the Institute of Electrical and Electronics Engineers (IEEE) and chair of the IEEE Signal Processing Society’s Speech Technical Committee. He was general co-chair of the 2001 IEEE workshop on Automatic Speech Recognition and Understanding (ASRU 2001), sponsorship chair of ASRU ’99 and publications chair of ICASSP ’98. He is an associate editor for Computer Speech and Language (Academic Press, UK) and a reviewer for numerous conferences and journals.