Automatic speaker, age-group and gender identification from children's speech

Saeid Safavi, Martin Russell, Peter Jancovic

Research output: Contribution to journal › Article › peer-review

30 Citations (Scopus)
466 Downloads (Pure)

Abstract

A speech signal contains important paralinguistic information, such as the identity, age, gender, language, accent, and emotional state of the speaker. Automatic recognition of these types of information in adults' speech has received considerable attention; however, there has been little work on children's speech. This paper focuses on speaker, gender, and age-group recognition from children's speech. The performances of several classification methods are compared, including Gaussian Mixture Model - Universal Background Model (GMM-UBM), GMM - Support Vector Machine (GMM-SVM), and i-vector based approaches. For speaker recognition, the error rate decreases as age increases, as one might expect. However, for gender and age-group recognition, the effect of age is more complex, due mainly to the consequences of the onset of puberty. Finally, the utility of different frequency bands for speaker, age-group, and gender recognition from children's speech is assessed.
Original language: English
Pages (from-to): 141-156
Number of pages: 16
Journal: Computer Speech and Language
Volume: 50
Early online date: 9 Jan 2018
DOIs
Publication status: Published - Jul 2018

Keywords

  • speaker recognition
  • gender identification
  • children's speech
  • age-group identification
  • Gaussian Mixture Model
  • Support Vector Machine
  • i-vector
