Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children's mindreading ability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper we implement and compare 7 different data augmentation strategies for the task of automatic scoring of children's ability to understand others' thoughts, feelings, and desires (or “mindreading”). We recruit in-domain experts to re-annotate augmented samples and determine to what extent each strategy preserves the original rating. We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. To determine the capabilities of automatic systems to generalize to unseen data, we create UK-MIND-20, a new corpus of children's performance on tests of mindreading, consisting of 10,320 question-answer pairs. We obtain a new state-of-the-art performance on the MIND-CA corpus, improving the macro-F1 score by 6 points. Results indicate that both the number of training examples and the quality of the augmentation strategies affect the performance of the systems. The task-specific augmentations generally outperform task-agnostic augmentations. Automatic augmentations based on vectors (GloVe, FastText) perform the worst. We find that systems trained on MIND-CA generalize well to UK-MIND-20. We demonstrate that data augmentation strategies also improve the performance on unseen data.

Original language: English
Title of host publication: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics, ACL
Pages: 1196-1206
Number of pages: 11
Volume: 1
ISBN (Print): 9781954085527
DOIs
Publication status: Published - 6 Aug 2021
Event: Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online
Duration: 1 Aug 2021 – 6 Aug 2021

Publication series

Name: International Joint Conference on Natural Language Processing (IJCNLP)

Conference

Conference: Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
City: Virtual, Online
Period: 1/08/21 – 6/08/21

Bibliographical note

Funding Information:
We would like to thank Imogen Grumley Traynor and Irene Luque Aguilera for the annotation and the creation of the lists of synonyms and phrases. We also want to thank the anonymous reviewers for their feedback and suggestions. This project was funded by a grant from Wellcome to R. T. Devine.

Publisher Copyright:
© 2021 Association for Computational Linguistics

ASJC Scopus subject areas

  • Software
  • Computational Theory and Mathematics
  • Linguistics and Language
  • Language and Linguistics
