The University of Birmingham 2018 spoken CALL shared task systems

Mengjie Qian; Xizi Wei; Peter Jančovič; Martin Russell

doi:10.21437/Interspeech.2018-1372

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

270 Downloads (Pure)

Abstract

This paper describes the systems developed by the University of Birmingham for the 2018 CALL Shared Task (ST) challenge. The task is to automatically assess grammatical and linguistic aspects of English spoken by German-speaking Swiss teenagers. Our systems consist of two components: automatic speech recognition (ASR) and text processing (TP). We explore several ways of building a DNN-HMM ASR system using the out-of-domain AMI speech corpus plus a limited amount of ST data. In development experiments on the initial ST data, our final ASR system achieved a word error rate (WER) of 12.00%, compared to 14.89% for the official ST baseline DNN-HMM system. A WER of 9.28% was achieved on the test set. For the TP component, we first post-process the ASR output to deal with hesitations and then pass it to a template-based grammar, which we expanded from the provided baseline. We also developed a TP system based on machine learning methods, which better accommodates the variability of spoken language. Finally, we fused the outputs of several systems using linear logistic regression. Our best system submitted to the challenge achieved an F-measure of 0.914, a D score of 10.764 and a D_{full} score of 5.691 on the final test set.
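
For readers unfamiliar with the WER figures quoted in the abstract, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance between a reference transcript and a recogniser hypothesis, divided by the number of reference words. It is not the authors' scoring code. The function names and the example sentences are hypothetical, chosen only for illustration; the 14.89% and 12.00% figures are taken from the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline` (both in the same unit)."""
    return (baseline - improved) / baseline


# Hypothetical example: one deleted word and one substituted word
# against a five-word reference.
example_wer = word_error_rate("i would like a ticket", "i like a tickets")

# Figures from the abstract: baseline 14.89% WER versus 12.00% WER.
dev_gain = relative_reduction(14.89, 12.00)
```

Under this definition, the development-set result in the abstract corresponds to a relative WER reduction of roughly 19% over the official baseline.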

Original language	English
Title of host publication	Proceedings of Interspeech 2018
Place of Publication	Hyderabad, India
Publisher	ISCA
Pages	2374-2378
Number of pages	5
DOIs	https://doi.org/10.21437/Interspeech.2018-1372
Publication status	Published - 3 Sept 2018
Event	Interspeech 2018 - Hyderabad International Convention Centre, Hyderabad, India. Duration: 2 Sept 2018 → 6 Sept 2018

Publication series

Name	Interspeech
Volume	2018
ISSN (Electronic)	1990-9772

Conference

Conference	Interspeech 2018
Country/Territory	India
City	Hyderabad
Period	2/09/18 → 6/09/18

Keywords

Spoken CALL
Shared Task
speech recognition
text processing
DNN-HMM
rule-based grammar
word2vec

Access to Document

10.21437/Interspeech.2018-1372
Licence: None: All rights reserved

Mengjie_Qian_et_al_The_University_of_Birmingham_2018_spoken_CALL_shared_task_systems_Proc_Interspeech_2018
Checked for eligibility: 13/09/2018
Qian, M., Wei, X., Jančovič, P., Russell, M. (2018) The University of Birmingham 2018 Spoken CALL Shared Task Systems. Proc. Interspeech 2018, 2374-2378, DOI: 10.21437/Interspeech.2018-1372.
Final published version, 652 KB
Licence: Other (please specify with Rights Statement)

Cite this

@inproceedings{15a78eaeb9164958966af696e1f53be8,
  title = "The {University of Birmingham} 2018 spoken {CALL} shared task systems",
  abstract = "This paper describes the systems developed by the University of Birmingham for the 2018 CALL Shared Task (ST) challenge. The task is to automatically assess grammatical and linguistic aspects of English spoken by German-speaking Swiss teenagers. Our systems consist of two components: automatic speech recognition (ASR) and text processing (TP). We explore several ways of building a DNN-HMM ASR system using the out-of-domain AMI speech corpus plus a limited amount of ST data. In development experiments on the initial ST data, our final ASR system achieved a word error rate (WER) of 12.00\%, compared to 14.89\% for the official ST baseline DNN-HMM system. A WER of 9.28\% was achieved on the test set. For the TP component, we first post-process the ASR output to deal with hesitations and then pass it to a template-based grammar, which we expanded from the provided baseline. We also developed a TP system based on machine learning methods, which better accommodates the variability of spoken language. Finally, we fused the outputs of several systems using linear logistic regression. Our best system submitted to the challenge achieved an F-measure of 0.914, a D score of 10.764 and a $D_{full}$ score of 5.691 on the final test set.",
  keywords = "Spoken CALL, Shared Task, speech recognition, text processing, DNN-HMM, rule-based grammar, word2vec",
  author = "Mengjie Qian and Xizi Wei and Peter Jan{\v{c}}ovi{\v{c}} and Martin Russell",
  year = "2018",
  month = sep,
  day = "3",
  doi = "10.21437/Interspeech.2018-1372",
  language = "English",
  series = "Interspeech",
  publisher = "ISCA",
  pages = "2374--2378",
  booktitle = "Proceedings of Interspeech 2018",
  note = "Interspeech 2018; Conference date: 02-09-2018 through 06-09-2018",
}

TY  - GEN
T1  - The University of Birmingham 2018 spoken CALL shared task systems
AU  - Qian, Mengjie
AU  - Wei, Xizi
AU  - Jančovič, Peter
AU  - Russell, Martin
PY  - 2018/9/3
Y1  - 2018/9/3
AB  - This paper describes the systems developed by the University of Birmingham for the 2018 CALL Shared Task (ST) challenge. The task is to automatically assess grammatical and linguistic aspects of English spoken by German-speaking Swiss teenagers. Our systems consist of two components: automatic speech recognition (ASR) and text processing (TP). We explore several ways of building a DNN-HMM ASR system using the out-of-domain AMI speech corpus plus a limited amount of ST data. In development experiments on the initial ST data, our final ASR system achieved a word error rate (WER) of 12.00%, compared to 14.89% for the official ST baseline DNN-HMM system. A WER of 9.28% was achieved on the test set. For the TP component, we first post-process the ASR output to deal with hesitations and then pass it to a template-based grammar, which we expanded from the provided baseline. We also developed a TP system based on machine learning methods, which better accommodates the variability of spoken language. Finally, we fused the outputs of several systems using linear logistic regression. Our best system submitted to the challenge achieved an F-measure of 0.914, a D score of 10.764 and a D_{full} score of 5.691 on the final test set.
KW  - Spoken CALL
KW  - Shared Task
KW  - speech recognition
KW  - text processing
KW  - DNN-HMM
KW  - rule-based grammar
KW  - word2vec
U2  - 10.21437/Interspeech.2018-1372
DO  - 10.21437/Interspeech.2018-1372
M3  - Conference contribution
T3  - Interspeech
SP  - 2374
EP  - 2378
BT  - Proceedings of Interspeech 2018
PB  - ISCA
CY  - Hyderabad, India
T2  - Interspeech 2018
Y2  - 2 September 2018 through 6 September 2018
ER  -