Abstraction not memory: BERT and the English article system

Harish Tayyar Madabushi; Dagmar Divjak; Petar Milin

doi:10.48550/arXiv.2206.04184

Research output: Working paper/Preprint › Preprint

Abstract

Article prediction is a task that has long defied accurate linguistic description. As such, this task is ideally suited to evaluating models on their ability to emulate native-speaker intuition. To this end, we compare the performance of native English speakers and pre-trained models on the task of article prediction, set up as a three-way choice (a/an, the, zero). Our experiments show that BERT outperforms humans on this task across all articles. In particular, BERT is far superior to humans at detecting the zero article, possibly because we insert zero articles using rules that the deep neural model can easily pick up. More interestingly, we find that BERT tends to agree more with annotators than with the corpus when inter-annotator agreement is high, but switches to agreeing more with the corpus as inter-annotator agreement drops. We contend that this alignment with annotators, despite BERT being trained on the corpus, suggests that BERT is not memorising article use but instead captures a high-level generalisation of article use akin to human intuition.

Original language	English
Publisher	arXiv
DOIs	https://doi.org/10.48550/arXiv.2206.04184
Publication status	Published - 8 Jun 2022

Bibliographical note

Accepted for publication at the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022). Data and code are available at https://github.com/H-TayyarMadabushi/Abstraction-not-Memory-BERT-and-the-English-Article-System-NAACL-2022

Keywords

cs.CL

Access to Document

10.48550/arXiv.2206.04184, Licence: Creative Commons: Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)

MadabushiH2022Abstraction, Other version, 181 KB, Licence: Creative Commons: Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)

1 Conference contribution

Abstraction not Memory: BERT and the English Article System
Madabushi, H. T., Divjak, D. & Milin, P., Jul 2022, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, ACL, p. 924-931 8 p. (North American Chapter of the Association for Computational Linguistics (NAACL)).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access

Cite this

@techreport{dbc40be3cb394927bc8ddadf76d08f08,
  title = "Abstraction not memory: BERT and the English article system",
  abstract = "Article prediction is a task that has long defied accurate linguistic description. As such, this task is ideally suited to evaluate models on their ability to emulate native-speaker intuition. To this end, we compare the performance of native English speakers and pre-trained models on the task of article prediction set up as a three way choice (a/an, the, zero). Our experiments with BERT show that BERT outperforms humans on this task across all articles. In particular, BERT is far superior to humans at detecting the zero article, possibly because we insert them using rules that the deep neural model can easily pick up. More interestingly, we find that BERT tends to agree more with annotators than with the corpus when inter-annotator agreement is high but switches to agreeing more with the corpus as inter-annotator agreement drops. We contend that this alignment with annotators, despite being trained on the corpus, suggests that BERT is not memorising article use, but captures a high level generalisation of article use akin to human intuition.",
  keywords = "cs.CL",
  author = "Madabushi, {Harish Tayyar} and Dagmar Divjak and Petar Milin",
  note = "Accepted for publication at 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022). Data and code available at https://github.com/H-TayyarMadabushi/Abstraction-not-Memory-BERT-and-the-English-Article-System-NAACL-2022",
  year = "2022",
  month = jun,
  day = "8",
  doi = "10.48550/arXiv.2206.04184",
  language = "English",
  publisher = "arXiv",
  type = "WorkingPaper",
  institution = "arXiv",
}
