My research on virtual humans began in 1997, when I was awarded a three-year “Challenge Grant” of $1.8 million from the National Science Foundation (with Dominic Massaro and Alex Waibel as co-Principal Investigators). The goal of this grant was to develop and integrate computer speech recognition, speech synthesis and character animation technologies into the CSLU Toolkit, and to use the toolkit to design applications to teach speech and language skills to students at the Tucker Maxon School in Portland, Oregon. The research produced a number of articles and, more importantly, significant and lasting benefits to the students who used the program (and to the many students who continue to use the program today).
Looking back on this wonderful project, it seems to me that a guardian angel must have guided our efforts. I was at CSLU in 1996, working on the CSLU Toolkit, which by then integrated computer speech recognition (developed at CSLU) and the Festival speech synthesis system (developed by Paul Taylor and Alan Black at the University of Edinburgh). Using the CSLU Toolkit’s Rapid Application Developer, or RAD, a graphical user interface for designing spoken dialogs, it was possible to develop a number of sophisticated applications, such as conversing with the system to retrieve weather forecasts from a Web site. Also in 1996, I reconnected with Dom Massaro, who had invented the Baldi system with Michael Cohen. Baldi is a 3D talking head with very accurate lip movements. I had not seen Dom in over 25 years, since I was a grad student at UC Riverside and Dom was a postdoc at UC San Diego. Shortly after I read the NSF Challenge Grant program announcement, I ran into a colleague, Kathy, who used to work at the Oregon Graduate Institute, at a supermarket I had never gone to before. She told me she had left OGI and was now working at the Tucker Maxon Oral School (now the Tucker Maxon School), a school that used an oral approach to instruction and language training for students with profound hearing loss. At that moment, the proverbial light bulb went on in my mind. I asked her if she thought Tucker Maxon might like to partner on an NSF grant proposal to develop a talking head that would converse with their students to teach speech and language skills. Dom was as excited as I was. We submitted the proposal, and to our great surprise, it was awarded. The research was conducted between 1997 and 2000.
On March 15, 2001, ABC TV’s Prime Time Thursday featured a segment in which Baldi, the 3D talking head invented by Dom Massaro and Michael Cohen at UC Santa Cruz, was shown helping children learn new vocabulary and dramatically improving their speech recognition and production skills. Prime Time introduced the segment with the words “This is what a small miracle looks like.” The National Science Foundation also featured the Baldi project on the NSF home page during March and April 2001. The Baldi project brought great benefits and joy to many children, and it continues to benefit children with sensory and cognitive disabilities through the efforts of Animated Speech Corporation, which has licensed the technology and extended its capabilities. The premiere of the Prime Time segment, however, was overshadowed by the tragic death of Mike Macon, an exceptional speech researcher who invented the voice of Baldi and who died the evening of the broadcast.
After moving to CU and establishing the Center for Spoken Language Research, I applied for an NSF ITR grant to develop perceptive animated agents that could be used as virtual tutors and therapists. This grant was awarded, and the technologies developed with support from this and other grants (e.g., the SONIC speech recognition system and the CU Animate system) led to subsequent grants and projects resulting in virtual tutoring and therapy programs.