The University of Birmingham 2018 spoken CALL shared task systems

Mengjie Qian, Xizi Wei, Peter Jancovic, Martin Russell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



This paper describes the systems developed by the University of Birmingham for the 2018 CALL Shared Task (ST) challenge. The task is to automatically assess grammatical and linguistic aspects of English spoken by German-speaking Swiss teenagers. Our systems consist of two components: automatic speech recognition (ASR) and text processing (TP). We explore several ways of building a DNN-HMM ASR system using the out-of-domain AMI speech corpus plus a limited amount of ST data. In development experiments on the initial ST data, our final ASR system achieved a word error rate (WER) of 12.00%, compared to 14.89% for the official ST baseline DNN-HMM system. On the test set, it achieved a WER of 9.28%. For the TP component, we first post-process the ASR output to deal with hesitations and then pass the result to a template-based grammar, which we expanded from the provided baseline. We also developed a TP system based on machine learning methods, which better accommodates the variability of spoken language. Finally, we fused the outputs of several systems using linear logistic regression. Our best system submitted to the challenge achieved an F-measure of 0.914, a D score of 10.764 and a D_{full} score of 5.691 on the final test set.
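As an illustration of the word error rate (WER) figures quoted in the abstract, the sketch below computes WER in the standard way: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and an ASR hypothesis, divided by the number of reference words. This is not the authors' scoring code; the function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenisation with no text normalisation is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative sketch only: tokenisation is plain whitespace splitting,
    which is an assumption, not the shared task's official scoring setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one reference word is missing from the hypothesis.
    print(word_error_rate("i would like a ticket", "i would like ticket"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.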
Original language: English
Title of host publication: Proceedings of Interspeech 2018
Place of publication: Hyderabad, India
Number of pages: 5
Publication status: Published - 3 Sept 2018
Event: Interspeech 2018 - Hyderabad International Convention Centre, Hyderabad, India
Duration: 2 Sept 2018 to 6 Sept 2018

Publication series

ISSN (Electronic): 1990-9772


Conference: Interspeech 2018


  • Spoken CALL
  • Shared Task
  • speech recognition
  • text processing
  • rule-based grammar
  • word2vec

