Our main focus in speech technology is currently on automatic speech recognition, speech-centric dialogue systems, and applications for small, mobile devices. We also maintain our expertise in speech synthesis and speech compression.
Speech-centric dialogue systems use speech as the main modality. Other modalities may be used as an alternative, or to augment the functionality and improve the user experience.
Speech input and output are important for making interaction between humans and digital services more natural and more effective. A speech-based user interface contributes to the goal of equal access and usability for all, irrespective of computer literacy or disability. We are developing large-vocabulary continuous speech recognition systems for the Norwegian language and spoken dialogue systems for limited domains.
Much of the basic technology used worldwide can be applied in a Norwegian speech-centric dialogue system. However, considerable R&D is needed to make effective use of the language-specific parts of such systems. This applies in particular to the exploitation of linguistic knowledge and semantic content in texts, dialogues, and spoken language.
A speech-centric dialogue system for the Norwegian language covers the major parts of the multidisciplinary research activities that we are involved in:
- Robust speech recognition in order to deal with environmental noise.
- Robust speech recognition in order to deal with man-made noise (e.g. speech disfluencies, non-speech sounds), dialect and speaker variations, and variations in the terminal equipment and transmission channel.
- Language understanding in order to improve recognition accuracy and extract the meaning of the recognized text.
- Dialogue management in order to generate textual prompts for greetings, missing information, answers to queries, information verification, and error handling.
- Speech output systems that convert the textual prompts to spoken messages.
- Implementation on small, mobile devices, and integration of services into seamless wired or wireless networks.
- Integration of modalities in addition to speech (e.g. touch screen).
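The dialogue management step in the list above can be illustrated with a minimal slot-filling sketch. It is not a description of our systems: the timetable domain, the slot names (origin, destination, time), the prompt texts, and the functions greet and next_prompt are all assumptions made for illustration. The sketch takes the output of language understanding as a dictionary and returns the next textual prompt, which a speech output system would then convert to a spoken message.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical slots and prompts for an illustrative timetable domain.
REQUIRED_SLOTS = ("origin", "destination", "time")
PROMPTS = {
    "origin": "Where are you travelling from?",
    "destination": "Where are you travelling to?",
    "time": "When do you want to leave?",
}


@dataclass
class DialogueState:
    slots: Dict[str, str] = field(default_factory=dict)
    confirmed: bool = False


def _question(state: DialogueState) -> str:
    """Return the prompt that moves the dialogue one step forward."""
    # Missing information: ask for the first slot that is still empty.
    for slot in REQUIRED_SLOTS:
        if slot not in state.slots:
            return PROMPTS[slot]
    s = state.slots
    summary = f"from {s['origin']} to {s['destination']} at {s['time']}"
    # Information verification: read the filled slots back to the user.
    if not state.confirmed:
        return f"You want to travel {summary}. Is that correct?"
    return f"Thank you. Looking up departures {summary}."


def greet(state: DialogueState) -> str:
    """Greeting followed by the first question."""
    return "Welcome. " + _question(state)


def next_prompt(state: DialogueState,
                understood: Optional[Dict[str, str]]) -> str:
    """Update the state and return the next textual prompt.

    `understood` is the language-understanding output for one user turn,
    or None if no meaning could be extracted (error handling).
    """
    if understood is None:
        return "Sorry, I did not understand. " + _question(state)
    filled = all(slot in state.slots for slot in REQUIRED_SLOTS)
    answer = understood.get("confirm")
    if filled and answer == "yes":
        state.confirmed = True
    elif filled and answer == "no":
        state.slots.clear()  # start over if the user rejects the summary
    else:
        # Keep only values for slots this domain knows about.
        state.slots.update(
            {k: v for k, v in understood.items() if k in REQUIRED_SLOTS}
        )
    return _question(state)
```

A turn such as next_prompt(state, {"origin": "Trondheim", "destination": "Oslo"}) fills two slots at once, so the next prompt asks only for the departure time. Keeping the state separate from the prompt texts means the same logic could drive a touch-screen form as an additional modality.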
In our present and future research activities we focus on "Design for all", i.e. the principle that computer-based systems should be accessible to all groups in society, including children, people with disabilities, and elderly people. Simple and intuitive multimodal user interfaces are therefore important.
Contact:
Erik Harborg
Tel.: +47 73 59 31 39