Dr.-Ing. Christian Hacker

-Assessment of non-native speakers
-Children's Speech
-Focus of Attention
-User States
-Speech Recognition

My research topics in the field of automatic speech recognition

Click here for information about my doctoral thesis.

Pronunciation scoring of foreign language learners

Can a computer judge whether the pronunciation of words and sentences in a foreign language is good enough? Can a computer help us while learning a new language, can it give us useful hints, can it even listen to us and assess us during a real-life conversation? Can a computer replace teachers when training children?
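One common family of techniques for this is a "goodness of pronunciation" (GOP) style score: compare the recognizer's confidence in the phone the learner was supposed to say with its confidence in the best-matching phone. A minimal sketch with hypothetical phone labels and posterior values, chosen only for illustration:

```python
import math

def goodness_of_pronunciation(posteriors, expected_phone):
    """GOP-style score: log ratio of the expected phone's posterior to the
    best-scoring phone's posterior. 0 means the expected phone matched best;
    more negative values suggest a mispronunciation."""
    best = max(posteriors.values())
    return math.log(posteriors[expected_phone] / best)

# Toy posteriors for one speech frame (hypothetical values)
posteriors = {"ae": 0.7, "eh": 0.2, "ih": 0.1}
score_good = goodness_of_pronunciation(posteriors, "ae")  # expected phone wins
score_bad = goodness_of_pronunciation(posteriors, "eh")   # negative score
```

Thresholding such scores per phone or per word is one way a tutoring system can decide which feedback to give.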

Read more about this.

Variability in children's speech

Classifiers are trained on large data sets, mainly from adult speakers. Does the accuracy of a classification system depend on the age of the speaker? What acoustic variability can be observed for different age groups such as young children, school-age children, or elderly people? Does this variability influence the accuracy of speech recognition systems, or the classification rate for paralinguistic characteristics of the voice such as pronunciation quality or sentiment?
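One well-known source of this variability is vocal tract length: children's shorter vocal tracts shift formants towards higher frequencies. A standard compensation is vocal tract length normalization (VTLN), which warps the frequency axis by a speaker-specific factor. A minimal piecewise-linear warp; the cutoff and bandwidth values below are arbitrary illustration choices:

```python
def vtln_warp(f, alpha, f_hi=6800.0, f_max=8000.0):
    """Piecewise-linear VTLN frequency warp: scale by alpha up to a cutoff
    f_hi, then interpolate linearly so that f_max still maps onto f_max."""
    if f <= f_hi:
        return alpha * f
    slope = (f_max - alpha * f_hi) / (f_max - f_hi)
    return alpha * f_hi + slope * (f - f_hi)

warped = vtln_warp(1000.0, 1.1)  # a child-like warp factor stretches 1 kHz
```

In practice the warp factor alpha is estimated per speaker, e.g. by maximizing the recognizer's likelihood over a grid of candidate factors.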

Read more about the assessment of language learners.
Read more about the characteristics of children's speech.

Multimodality and fusion of sensor data

Is it possible to classify the focus of attention of a speaker, i.e. to whom somebody is speaking? Is she/he speaking to a human or to a machine such as a robot or a handheld device? Can we recognize this from the prosody, i.e. how someone is speaking? How much can we recognize from audio, and how much from visual information? How could we perform a multimodal fusion?
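One simple way to combine the modalities is late fusion: each modality's classifier produces class posteriors, and a weighted sum decides. A sketch with made-up scores and an arbitrarily chosen audio weight:

```python
def late_fusion(audio_probs, video_probs, w_audio=0.6):
    """Weighted late fusion of two classifiers' class posteriors.
    The weight reflects how much each modality is trusted."""
    return {c: w_audio * audio_probs[c] + (1 - w_audio) * video_probs[c]
            for c in audio_probs}

# Toy posteriors: is the speaker addressing a human or the machine?
audio = {"human": 0.8, "machine": 0.2}   # prosody suggests human-directed
video = {"human": 0.3, "machine": 0.7}   # gaze suggests machine-directed
fused = late_fusion(audio, video, w_audio=0.6)
decision = max(fused, key=fused.get)
```

Early fusion (concatenating audio and visual features before a single classifier) is the main alternative; late fusion has the practical advantage that each modality can be trained and tuned independently.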

Read more about user focus detection.

Classification of emotional user states

Is it possible to analyse the sentiment of a speaker from the audio signal? Which emotions and user states occur in which kinds of data? Is prosody sufficient to recognize user states? Which user states can be found in dialog systems, and which in robot interaction? Which emotions can be observed in users of an interactive system, and what about children?
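Prosody-based user-state classification typically starts from utterance-level statistics (functionals) of frame-wise pitch and energy. A minimal sketch with toy frame values; real systems compute hundreds of such functionals:

```python
import statistics

def prosodic_features(f0_contour, energies):
    """Utterance-level prosodic statistics of the kind fed to a
    user-state classifier. An F0 value of 0 marks an unvoiced frame."""
    voiced = [f for f in f0_contour if f > 0]
    return {
        "f0_mean": statistics.mean(voiced),          # overall pitch level
        "f0_range": max(voiced) - min(voiced),       # pitch excursion
        "energy_std": statistics.pstdev(energies),   # loudness variability
    }

# Toy per-frame pitch (Hz) and energy values
feats = prosodic_features([0.0, 200.0, 220.0, 0.0, 240.0],
                          [0.1, 0.5, 0.5, 0.1, 0.5])
```

A raised pitch level and larger energy variability are, for instance, typical correlates of aroused states such as anger.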

Read more about classification of user states.

Speech recognition and acoustic feature extraction

What features are extracted from the acoustic signal to feed a speech recognizer? What is the cepstrum, what are TRAPs? Classical feature extraction was designed for speech recognition systems trained on small amounts of data, whereas recurrent neural networks can nowadays also deal with the acoustic spectrum or the raw audio signal itself. Recognizers have changed from HMM-based architectures with continuous or semi-continuous codebooks to neural-network-based architectures.
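The cepstrum mentioned above is, loosely, the inverse transform of a log spectrum; in MFCC extraction this step is usually a DCT-II applied to the log mel-filterbank energies, which compacts and decorrelates them. A minimal sketch of just that cepstral step:

```python
import math

def cepstral_coefficients(log_filterbank, n_coeffs):
    """DCT-II of log filterbank energies -- the cepstral step of MFCC
    extraction. Low coefficients capture the smooth spectral envelope."""
    n = len(log_filterbank)
    return [sum(log_filterbank[k] * math.cos(math.pi * i * (k + 0.5) / n)
                for k in range(n))
            for i in range(n_coeffs)]

# A flat log spectrum concentrates all its information in coefficient 0
coeffs = cepstral_coefficients([1.0] * 8, 3)
```

TRAPs (temporal patterns), by contrast, describe the long-term trajectory of a single frequency band over many frames rather than the full spectrum of one frame.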

Read more about acoustic feature extraction and speech recognition.

Speech dialog modeling

How can dialog systems react to user input? How can we extract semantic knowledge about the intent and about slots from the output of the speech recognizer when that output is an unseen or grammatically erroneous phrase? How can we ask the user for missing information to fill all required slots? How can we compute the reaction of a system to a user input?
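The intent-and-slot idea can be illustrated with a deliberately naive rule-based parser; real natural language understanding components are statistical, and the intent name, slot name, and patterns here are invented for this example:

```python
import re

def parse_utterance(text):
    """Toy intent and slot extraction from a recognizer hypothesis.
    Any missing required slot would trigger a follow-up question."""
    intent = "navigate" if "drive" in text or "navigate" in text else "unknown"
    match = re.search(r"\bto (\w+)", text)
    slots = {"destination": match.group(1)} if match else {}
    missing = [s for s in ("destination",) if s not in slots]
    return {"intent": intent, "slots": slots, "missing": missing}

complete = parse_utterance("please drive me to Erlangen")
incomplete = parse_utterance("drive me downtown")  # destination slot unfilled
```

When `missing` is non-empty, the dialog manager would prompt for exactly that slot ("Where would you like to go?") instead of restarting the whole interaction.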

[42] Hacker, Christian; Sowa, Timo; Weilhammer, Karl; Springer, Volker; Massonie, Dominique; Ranzenberger, Thomas; Gallwitz, Florian
Interacting with Robots - Tooling and Framework for Advanced Speech User Interfaces. In: Elektronische Sprachsignalverarbeitung (ESSV) 2017, Tagungsband der 28. Konferenz, Studientexte zur Sprachkommunikation 86, Thelem Universitätsverlag, Dresden, ISBN 978-3-95908-094-1

[41] Massonie, Dominique; Hacker, Christian; Sowa, Timo
Modeling Graphical and Speech User Interfaces with Widgets and Spidgets. In: ITG-Fachbericht 252: Speech Communication (24.-26.09.2014, Erlangen), Berlin/Offenbach: VDE Verlag GmbH 2014, ISBN 978-3-8007-3640-9

Read more about current developments at Elektrobit Automotive GmbH here. You can also find videos about speech dialog modeling, automotive products including speech recognition, an image video showing us at work, and human-robot interaction including speech recognition.

Android game with voice control

Are there any games available that use speech recognition? It is easy to integrate automatic speech recognition into Android applications. VoiceWalker is an example in which you need to find the right keywords to succeed.
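Matching keywords against a recognizer result can be as simple as checking the hypothesis against a set of accepted phrases. A sketch; the commands below are invented and the actual VoiceWalker vocabulary may differ:

```python
def matches_keyword(hypothesis, keywords):
    """Return the first accepted keyword found in the recognizer
    hypothesis, or None. Case-insensitive substring matching keeps
    the example simple."""
    text = hypothesis.lower()
    for keyword in keywords:
        if keyword.lower() in text:
            return keyword
    return None

KEYWORDS = ["open door", "turn left", "jump"]   # hypothetical game commands
hit = matches_keyword("Please open door now", KEYWORDS)
miss = matches_keyword("hello world", KEYWORDS)
```

A production game would instead constrain the recognizer with a grammar of the accepted commands, so that out-of-vocabulary speech is rejected by the recognizer itself.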

Read more about VoiceWalker here.

Occupation of the author

Dr. Christian Hacker worked for more than five years as a researcher in the field of automatic speech recognition at the Pattern Recognition Lab (Computer Science Department 5) of Friedrich-Alexander University Erlangen-Nuremberg. Click here for further information.
Since 2008 he has been responsible for developing speech dialog solutions at Elektrobit Automotive GmbH.