Unsupervised Model Selection for Recognition of Regional Accented Speech

Maryam Najafian, Andrea DeMarco, Stephen Cox, Martin Russell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

This paper is concerned with automatic speech recognition (ASR) for accented speech. Given a small amount of speech from a new speaker, is it better to apply speaker adaptation to the baseline, or to use accent identification (AID) to identify the speaker’s accent and select an accent-dependent acoustic model? Three accent-based model selection methods are investigated: using the ‘true’ accent model, and unsupervised model selection using i-Vector and phonotactic-based AID. All three methods outperform the unadapted baseline. Most significantly, AID-based model selection using 43 s of speech performs better than unsupervised speaker adaptation, even if the latter uses five times more adaptation data. Combining unsupervised AID-based model selection and speaker adaptation gives an average relative reduction in ASR error rate of up to 47%.
Original language: English
Title of host publication: INTERSPEECH 2014
Publisher: ISCA
Pages: 2967-2971
Number of pages: 5
Publication status: Published - 2014
Event: Interspeech 2014 - Singapore, Singapore
Duration: 14 Sept 2014 – 18 Sept 2014

Conference

Conference: Interspeech 2014
Country/Territory: Singapore
City: Singapore
Period: 14/09/14 – 18/09/14

Keywords

  • speech recognition
  • model selection
  • accent recognition
  • i-vector
