Abstract
This paper examines the extent to which computer speech
recognition errors for children’s speech can be attributed to
common phonological effects associated with language acquisition.
Recognition results are presented for three corpora of children’s
speech, two comprising recordings of American English
spoken by five- to nine-year-olds and one comprising recordings
of British English speech from children aged five and six.
The results are compared with adult reference confusion matrices
based on TIMIT for the first two experiments and with
confusion matrices for British adults and children with good
speech for the third. The results appear to be influenced by three factors:
(i) confusions that are predictable from phonological factors
associated with language acquisition also arise from acoustic
confusability (e.g. /k/->/t/), (ii) the frequency of
phonological errors is expected to decrease with increasing age,
and (iii) an accurate recogniser is more likely than a less accurate
one to detect a phonological error when it occurs. Overall,
the percentage of errors attributable to phonological processes
remains approximately constant in each experiment. However,
the proportion of these errors that differ significantly from reference
patterns increases with recognition accuracy and is greater
for children who are judged to have poor speech.
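
As a rough illustration of the kind of quantity discussed above, the following minimal sketch computes the share of substitution errors that match a set of phonologically predictable confusions. It is not the paper's method: the function name `phonological_error_share`, the contents of `PHONOLOGICAL_PAIRS` (other than /k/->/t/, which the abstract mentions) and the aligned phone pairs are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical set of (reference, recognised) phone pairs treated as
# phonologically predictable. Only ("k", "t") comes from the abstract;
# the others are common acquisition patterns assumed for illustration.
PHONOLOGICAL_PAIRS = {("k", "t"), ("g", "d"), ("r", "w"), ("th", "f")}


def phonological_error_share(pairs):
    """Return the fraction of substitution errors found in PHONOLOGICAL_PAIRS.

    `pairs` is an iterable of (reference_phone, recognised_phone) tuples,
    e.g. taken from an alignment of reference and recogniser output.
    """
    confusions = Counter(pairs)  # sparse confusion matrix
    # Keep only off-diagonal cells, i.e. substitution errors.
    errors = {p: n for p, n in confusions.items() if p[0] != p[1]}
    total = sum(errors.values())
    if total == 0:
        return 0.0
    matched = sum(n for p, n in errors.items() if p in PHONOLOGICAL_PAIRS)
    return matched / total


# Made-up aligned phone pairs, not data from the paper.
aligned = [("k", "t"), ("k", "k"), ("s", "s"), ("r", "w"),
           ("s", "z"), ("k", "t"), ("m", "n")]
share = phonological_error_share(aligned)
print(f"{share:.2f}")
```

Deciding whether such a share differs significantly from an adult reference pattern would need a separate statistical test, which this sketch does not attempt.
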
Original language | English |
---|---|
Title of host publication | Proceedings of Workshop on Child-Computer Interaction (WOCCI 2016) |
Publisher | ISCA |
Pages | 10-15 |
Number of pages | 6 |
Publication status | Published - 6 Sept 2016 |