Master 2017-2018
Internships of the SAR specialization
ASR-Based Speech Intelligibility Prediction from Binaural Signals


Site: OFFIS e.V. - Institut für Informatik, R&D Division Health
Location: OFFIS e.V. - Institut für Informatik, R&D Division Health, Escherweg 2 - 26121 Oldenburg - Germany
Supervisor: Benjamin Cauchi, Scientific Researcher, Automation and Integration Technology Group
Dates: from 15/02/2018 to 15/08/2018
Remuneration: 450€ per month (subject to German regulations)
Keywords: ATIAM track: Acoustics, ATIAM track: Computer music, ATIAM track: Signal processing

Description

The accuracy of automatic speech recognition (ASR) systems and the speech intelligibility perceived by a human listener are defined in a similar way, i.e., as the ratio between the number of words correctly identified and the number of words present in the target signal. Additionally, it is well known that binaural cues have a large impact on speech intelligibility [1] and that preprocessing that exploits the spatial characteristics of the signal, e.g. beamforming, can greatly improve the performance of ASR systems [2]. Speech enhancement, which aims at improving speech intelligibility, and preprocessing, which aims at improving ASR accuracy, are often based on similar processing schemes, and approaches designed for one application can be exploited by the other. For example, features used by ASR systems are often derived from psychoacoustic models [3], while recent approaches have used ASR systems to predict speech intelligibility [4].
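
As an illustration of the shared definition above, the following minimal Python sketch computes the ratio between the number of correctly identified words and the number of words in the target sentence. The function name `word_recognition_rate` and the use of a longest-common-subsequence alignment to count correct words are assumptions made for this example. They are not taken from the internship's tools, and real listening tests or ASR toolkits may score words differently.

```python
def word_recognition_rate(reference: str, hypothesis: str) -> float:
    """Fraction of reference words that are correctly identified.

    Hypothetical helper: correct words are counted as the length of the
    longest common subsequence of the two word sequences, so word order
    matters and inserted words are ignored.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row dynamic programme for the longest common subsequence.
    prev = [0] * (len(hyp) + 1)
    for ref_word in ref:
        curr = [0]
        for j, hyp_word in enumerate(hyp, start=1):
            if ref_word == hyp_word:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # The same function can score a listener's response or an ASR transcript.
    target = "peter got three large flowers"
    response = "peter got three red flowers"
    print(word_recognition_rate(target, response))
```

Because the score only depends on the target and the recognized word sequence, the same measure applies whether the response comes from a human listener or from an ASR system.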

The aim of this internship is to develop an ASR-based predictor of speech intelligibility that takes into account the impact of binaural cues. After getting familiar with the literature, you will integrate standard spatial features into an existing ASR framework and apply the developed ASR tools to a provided database of binaural signals. The same database will be used to conduct speech intelligibility measurements with normal-hearing listeners. The developed tools and the collected scores will be used to train and test an ASR-based predictor of speech intelligibility that takes the impact of binaural cues into account.
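
To give an idea of what spatial features derived from binaural signals can look like, the sketch below estimates two classical binaural cues, the interaural level difference (ILD) and the interaural time difference (ITD), from a two-channel signal. Which features the internship will actually use is not specified here, so the choice of ILD and ITD, the function names, and the broadband (non-frame-based) processing are all assumptions made for illustration. The code uses only the Python standard library to stay self-contained.

```python
import math
import random


def interaural_level_difference(left, right, eps=1e-12):
    """ILD in dB, positive when the left channel carries more energy."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


def interaural_time_difference(left, right, fs, max_lag):
    """ITD in seconds, taken at the peak of the cross-correlation.

    Positive when the right channel lags behind the left channel.
    Only lags between -max_lag and +max_lag samples are searched.
    """
    n = min(len(left), len(right))
    best_lag, best_value = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[t] with right[t + lag] over the valid overlap.
        start = max(0, -lag)
        stop = min(n, n - lag)
        value = sum(left[t] * right[t + lag] for t in range(start, stop))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag / fs


if __name__ == "__main__":
    # Hypothetical test signal: white noise reaching the right ear
    # 8 samples later (0.5 ms at 16 kHz) and at half the amplitude.
    rng = random.Random(0)
    fs, delay = 16000, 8
    left = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    right = [0.0] * delay + [0.5 * x for x in left[:-delay]]
    print(interaural_level_difference(left, right))
    print(interaural_time_difference(left, right, fs, max_lag=20))
```

In an ASR front end, such cues would typically be computed per short time frame and per frequency band before being appended to the acoustic features. The broadband version above only illustrates the principle.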

The work requires a solid background in audio signal processing, an interest in machine learning, programming experience with Matlab, and a background in either C or C++. A good level of English is expected for writing publications, e.g. the internship report, and for communicating results internally.

Please apply by email, sending a CV and a cover letter in English to:

Benjamin Cauchi, Scientific Researcher, Automation and Integration Technology Group, OFFIS e.V. - Institut für Informatik, R&D Division Health, Escherweg 2 - 26121 Oldenburg - Germany, benjamin.cauchi@offis.de

Bibliography

[1] J. Blauert, "Spatial Hearing: The Psychophysics of Human Sound Localization", Cambridge, MA, USA: MIT Press, 1997.

[2] Results of the 4th CHiME Speech Separation and Recognition Challenge, "http://spandh.dcs.shef.ac.uk/chime_..."

[3] N. Moritz et al., "An Auditory Inspired Amplitude Modulation Filter Bank for Robust Feature Extraction in Automatic Speech Recognition", in IEEE/ACM Trans. on Audio, Speech, and Language Processing, 2015.

[4] B. Kollmeier et al., "Sentence Recognition Prediction for Hearing-impaired Listeners in Stationary and Fluctuation Noise With FADE: Empowering the Attenuation and Distortion Concept by Plomp With a Quantitative Processing Model", in Trends in Hearing, 2016.