UoB at ProfNER 2021: data augmentation for classification using machine translation

Frances Adriana Laureano De Leon, Harish Tayyar Madabushi, Mark Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper describes the participation of the UoB-NLP team in the ProfNER-ST shared subtask 7a. The task was aimed at detecting mentions of professions in social media text. Our team experimented with two methods of improving the performance of pre-trained models: data augmentation through translation and the merging of multiple language inputs. While the best-performing model on the test data was mBERT fine-tuned on data augmented using back-translation, the improvement is minor, possibly because multilingual pre-trained models such as mBERT already have access to the kind of information provided through back-translation and bilingual data.
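
The back-translation augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `es`/`en` language pair, and the word-lookup `toy_translate` stand-in are all assumptions made for illustration. In practice a real machine translation system would take the place of `toy_translate`.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]                    # (text, binary label)
Translator = Callable[[str, str, str], str]  # (text, src, tgt) -> translated text


def back_translate(text: str, translate: Translator,
                   src: str = "es", pivot: str = "en") -> str:
    """Translate src -> pivot -> src to obtain a paraphrase of the input."""
    pivot_text = translate(text, src, pivot)
    return translate(pivot_text, pivot, src)


def augment(examples: List[Example], translate: Translator,
            src: str = "es", pivot: str = "en") -> List[Example]:
    """Return the original examples plus any new back-translated paraphrases.

    Each paraphrase keeps the label of the example it came from.
    Paraphrases identical to a text already in the set are skipped.
    """
    augmented = list(examples)
    seen = {text for text, _ in examples}
    for text, label in examples:
        paraphrase = back_translate(text, translate, src, pivot)
        if paraphrase not in seen:
            seen.add(paraphrase)
            augmented.append((paraphrase, label))
    return augmented


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Hypothetical stand-in for an MT system: word-level lookup only.

    Words missing from the lookup table pass through unchanged.
    """
    table = {
        ("es", "en"): {"el": "the", "doctor": "doctor", "trabaja": "works"},
        ("en", "es"): {"the": "el", "doctor": "médico", "works": "trabaja"},
    }
    mapping = table[(src, tgt)]
    return " ".join(mapping.get(word, word) for word in text.split())


train = [("el doctor trabaja", 1), ("hola mundo", 0)]
augmented_train = augment(train, toy_translate)
```

The augmented set would then be used to fine-tune a classifier in place of the original training set. The duplicate check matters because a round trip through translation often returns the input unchanged, which would only repeat existing examples.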
Original language: English
Title of host publication: Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task
Editors: Arjun Magge, Ari Klein, Antonio Miranda-Escalada, Mohammed Ali Al-garadi, Ilseyar Alimova, Zulfat Miftahutdinov, Eulalia Farre-Maduell, Salvador Lima-Lopez, Ivan Flores, Karen O'Connor, Davy Weissenbacher, Elena Tutubalina, Abeed Sarker, Juan M Banda, Martin Krallinger, Graciela Gonzalez-Hernandez
Publisher: Association for Computational Linguistics, ACL
Pages: 115–117
Number of pages: 3
ISBN (Electronic): 9781954085312
Publication status: Published - 10 Jun 2021
Event: Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task - Mexico City, Mexico and Online, Mexico City, Mexico
Duration: 10 Jun 2021 → 10 Jun 2021
Conference number: 6th

Publication series

Name: Social Media Mining for Health (SMM4H)

Conference

Conference: Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task
Country/Territory: Mexico
City: Mexico City
Period: 10/06/21 → 10/06/21

Bibliographical note

Publisher Copyright:
© 2021 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Software
