TY - GEN
T1 - Multi-class Hierarchical Question Classification for Multiple Choice Science Exams
AU - Xu, Dongfang
AU - Jansen, Peter
AU - Martin, Jaycie
AU - Xie, Zhengnan
AU - Yadav, Vikas
AU - Tayyar Madabushi, Harish
AU - Tafjord, Oyvind
AU - Clark, Peter
PY - 2020/5/13
Y1 - 2020/5/13
N2 - Prior work has demonstrated that question classification (QC), recognizing the problem domain of a question, can help answer it more accurately. However, developing strong QC algorithms has been hindered by the limited size and complexity of annotated data available. To address this, we present the largest challenge dataset for QC, containing 7,787 science exam questions paired with detailed classification labels from a fine-grained hierarchical taxonomy of 406 problem domains. We then show that a BERT-based model trained on this dataset achieves a large (+0.12 MAP) gain compared with previous methods, while also achieving state-of-the-art performance on benchmark open-domain and biomedical QC datasets. Finally, we show that using this model’s predictions of question topic significantly improves the accuracy of a question answering system by +1.7% P@1, with substantial future gains possible as QC performance improves.
KW - question answering
KW - question classification
UR - http://www.lrec-conf.org/proceedings/lrec2020/LREC-2020.pdf
M3 - Conference contribution
T3 - Language Resources and Evaluation (LREC) proceedings
SP - 5370
EP - 5382
BT - Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)
A2 - Calzolari, Nicoletta
PB - European Language Resources Association (ELRA)
T2 - 12th Conference on Language Resources and Evaluation (LREC 2020)
Y2 - 13 May 2020 through 16 May 2020
ER -