Acoustic model selection using limited data for accent robust speech recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper investigates techniques to compensate for the effects of regional accents of British English on automatic speech recognition (ASR) performance. Given a small amount of speech from a new speaker, is it better to apply speaker adaptation, or to use accent identification (AID) to identify the speaker's accent and then apply accent-dependent ASR? Three approaches to accent-dependent modelling are investigated: using the 'correct' accent model, choosing a model using supervised (ACCDIST-based) AID, and building a model using data from neighbouring speakers in 'AID space'. All of the methods outperform the accent-independent model, with relative reductions in ASR error rate of up to 44%. Using on average 43 s of speech to identify an appropriate accent-dependent model outperforms using it for supervised speaker adaptation by 7%.

Details

Original language: English
Title of host publication: 2014 Proceedings of the 22nd European Signal Processing Conference (EUSIPCO)
Publication status: Published - 2014
Event: 22nd European Signal Processing Conference, EUSIPCO 2014 - Lisbon, Portugal
Duration: 1 Sep 2014 to 5 Sep 2014

Conference

Conference: 22nd European Signal Processing Conference, EUSIPCO 2014
Country: Portugal
City: Lisbon
Period: 1/09/14 to 5/09/14

Keywords

  • speech recognition, accent recognition