Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability

Research output: Chapter in Book/Report/Conference proceeding › Chapter (peer-reviewed) › peer-review

Standard

Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability. / Kovatchev, Venelin; Smith, Phillip; Lee, Mark; Devine, R.T.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). ed. / Chengqing Zong; Fei Xia; Wenjie Li; Roberto Navigli. Association for Computational Linguistics, ACL, 2021. (International Joint Conference on Natural Language Processing (IJCNLP)).

Research output: Chapter in Book/Report/Conference proceeding › Chapter (peer-reviewed) › peer-review

Harvard

Kovatchev, V, Smith, P, Lee, M & Devine, RT 2021, Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability. in C Zong, F Xia, W Li & R Navigli (eds), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). International Joint Conference on Natural Language Processing (IJCNLP), Association for Computational Linguistics, ACL, The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Bangkok, Thailand, 1/08/21. <https://arxiv.org/abs/2106.01635>

APA

Kovatchev, V., Smith, P., Lee, M., & Devine, R. T. (2021). Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability. In C. Zong, F. Xia, W. Li, & R. Navigli (Eds.), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (International Joint Conference on Natural Language Processing (IJCNLP)). Association for Computational Linguistics, ACL. https://arxiv.org/abs/2106.01635

Vancouver

Kovatchev V, Smith P, Lee M, Devine RT. Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability. In Zong C, Xia F, Li W, Navigli R, editors, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, ACL. 2021. (International Joint Conference on Natural Language Processing (IJCNLP)).

Author

Kovatchev, Venelin ; Smith, Phillip ; Lee, Mark ; Devine, R.T. / Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). editor / Chengqing Zong ; Fei Xia ; Wenjie Li ; Roberto Navigli. Association for Computational Linguistics, ACL, 2021. (International Joint Conference on Natural Language Processing (IJCNLP)).

Bibtex

@inbook{f77f24e791b2449eb17bd8c9d99f0ca5,
title = "Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children{\textquoteright}s mindreading ability",
abstract = "In this paper we implement and compare 7 different data augmentation strategies for the task of automatic scoring of children{\textquoteright}s ability to understand others{\textquoteright} thoughts, feelings, and desires (or “mindreading”). We recruit in-domain experts to re-annotate augmented samples and determine to what extent each strategy preserves the original rating. We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. To determine the capabilities of automatic systems to generalize to unseen data, we create UK-MIND-20 - a new corpus of children{\textquoteright}s performance on tests of mindreading, consisting of 10,320 question-answer pairs. We obtain a new state-of-the-art performance on the MIND-CA corpus, improving macro-F1-score by 6 points. Results indicate that both the number of training examples and the quality of the augmentation strategies affect the performance of the systems. The task-specific augmentations generally outperform task-agnostic augmentations. Automatic augmentations based on vectors (GloVe, FastText) perform the worst. We find that systems trained on MIND-CA generalize well to UK-MIND-20. We demonstrate that data augmentation strategies also improve the performance on unseen data.",
author = "Venelin Kovatchev and Phillip Smith and Mark Lee and R.T. Devine",
year = "2021",
month = jul,
day = "27",
language = "English",
isbn = "978-1-954085-52-7",
series = "International Joint Conference on Natural Language Processing (IJCNLP)",
publisher = "Association for Computational Linguistics, ACL",
editor = "Chengqing Zong and Fei Xia and Wenjie Li and Roberto Navigli",
booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
note = "The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing , ACL-IJCNLP 2021 ; Conference date: 01-08-2021 Through 06-08-2021",

}
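
The abstract above mentions automatic augmentations based on word vectors (GloVe, FastText). As a rough illustration only, and not the authors' implementation, the sketch below shows one common form such an augmentation can take: replacing tokens in a child's answer with their nearest neighbours in a pre-trained GloVe embedding space. The file name, neighbour selection, and replacement probability are illustrative assumptions.

# Illustrative sketch of vector-based data augmentation (not the paper's code).
# Assumes a plain-text GloVe file ("glove.6B.100d.txt") with one word per line
# followed by its vector components.
import random
import numpy as np

def load_glove(path):
    """Load GloVe vectors into a dict of word -> unit-normalised numpy array."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vec = np.asarray(parts[1:], dtype=np.float32)
            vectors[parts[0]] = vec / (np.linalg.norm(vec) + 1e-8)
    return vectors

def nearest_neighbour(word, vectors):
    """Return the word whose vector has the highest cosine similarity to `word`."""
    query = vectors[word]
    best_word, best_sim = None, -1.0
    for candidate, vec in vectors.items():
        if candidate == word:
            continue
        sim = float(np.dot(query, vec))  # vectors are unit-normalised
        if sim > best_sim:
            best_word, best_sim = candidate, sim
    return best_word

def augment(answer, vectors, p_replace=0.2, seed=0):
    """Replace each in-vocabulary token with its nearest neighbour with probability p_replace."""
    rng = random.Random(seed)
    out = []
    for tok in answer.split():
        if tok.lower() in vectors and rng.random() < p_replace:
            out.append(nearest_neighbour(tok.lower(), vectors))
        else:
            out.append(tok)
    return " ".join(out)

# Example usage (assuming the GloVe file is available locally):
# vectors = load_glove("glove.6B.100d.txt")
# print(augment("she thought the ice cream van had already left", vectors))

Such a replacement can shift the meaning that human raters originally scored, which is one reason the abstract reports that in-domain experts re-annotated augmented samples to check whether each strategy preserves the original rating.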

RIS

TY - CHAP

T1 - Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children’s mindreading ability

AU - Kovatchev, Venelin

AU - Smith, Phillip

AU - Lee, Mark

AU - Devine, R.T.

PY - 2021/7/27

Y1 - 2021/7/27

N2 - In this paper we implement and compare 7 different data augmentation strategies for the task of automatic scoring of children’s ability to understand others’ thoughts, feelings, and desires (or “mindreading”). We recruit in-domain experts to re-annotate augmented samples and determine to what extent each strategy preserves the original rating. We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. To determine the capabilities of automatic systems to generalize to unseen data, we create UK-MIND-20 - a new corpus of children’s performance on tests of mindreading, consisting of 10,320 question-answer pairs. We obtain a new state-of-the-art performance on the MIND-CA corpus, improving macro-F1-score by 6 points. Results indicate that both the number of training examples and the quality of the augmentation strategies affect the performance of the systems. The task-specific augmentations generally outperform task-agnostic augmentations. Automatic augmentations based on vectors (GloVe, FastText) perform the worst. We find that systems trained on MIND-CA generalize well to UK-MIND-20. We demonstrate that data augmentation strategies also improve the performance on unseen data.

AB - In this paper we implement and compare 7 different data augmentation strategies for the task of automatic scoring of children’s ability to understand others’ thoughts, feelings, and desires (or “mindreading”). We recruit in-domain experts to re-annotate augmented samples and determine to what extent each strategy preserves the original rating. We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. To determine the capabilities of automatic systems to generalize to unseen data, we create UK-MIND-20 - a new corpus of children’s performance on tests of mindreading, consisting of 10,320 question-answer pairs. We obtain a new state-of-the-art performance on the MIND-CA corpus, improving macro-F1-score by 6 points. Results indicate that both the number of training examples and the quality of the augmentation strategies affect the performance of the systems. The task-specific augmentations generally outperform task-agnostic augmentations. Automatic augmentations based on vectors (GloVe, FastText) perform the worst. We find that systems trained on MIND-CA generalize well to UK-MIND-20. We demonstrate that data augmentation strategies also improve the performance on unseen data.

UR - https://2021.aclweb.org/

UR - https://www.aclweb.org/anthology/venues/acl/

UR - https://www.aclweb.org/anthology/faq/

M3 - Chapter (peer-reviewed)

SN - 978-1-954085-52-7

T3 - International Joint Conference on Natural Language Processing (IJCNLP)

BT - Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

A2 - Zong, Chengqing

A2 - Xia, Fei

A2 - Li, Wenjie

A2 - Navigli, Roberto

PB - Association for Computational Linguistics, ACL

T2 - The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing

Y2 - 1 August 2021 through 6 August 2021

ER -