An autoencoder-based representation for noise reduction in distant supervision of relation extraction

Juan Luis García-Mendoza, Luis Villaseñor-Pineda, Felipe Orihuela-Espina, Lázaro Bustio-Martínez

Research output: Contribution to journal › Article › peer-review

Abstract

Distant Supervision is an approach that allows automatic labeling of instances, and it has been used in Relation Extraction. The main challenge of this task is handling instances with noisy labels (e.g., when two entities in a sentence are automatically labeled with an invalid relation). The approaches reported in the literature address this problem by employing noise-tolerant classifiers. However, introducing a noise reduction stage before the classification step increases the macro precision values. This paper proposes an Adversarial Autoencoder-based approach for obtaining a new representation that allows noise reduction in Distant Supervision. The representation obtained using Adversarial Autoencoders minimizes the intra-cluster distance compared with pre-trained embeddings and classic Autoencoders. Experiments demonstrated that, with the same classifier, the macro precision values on the noise-reduced datasets are similar to those obtained on the original dataset while using fewer instances. For example, in one of the noise-reduced datasets, the macro precision improved by approximately 2.32% using 77% of the original instances. This suggests the validity of using Adversarial Autoencoders to obtain well-suited representations for noise reduction. The proposed approach also maintains the macro precision values of the original dataset while reducing the total number of instances needed for classification.
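The abstract reports its results as macro precision. As a reading aid, the following is a minimal sketch of how that metric is commonly computed: the unweighted mean of per-class precision. It is not the authors' evaluation code; the function name `macro_precision`, the relation labels, and the convention of scoring a never-predicted class as 0 are assumptions made for illustration.

```python
from collections import defaultdict


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision.

    Assumption: a class that is never predicted contributes a precision of 0.
    """
    true_positives = defaultdict(int)
    predicted_counts = defaultdict(int)
    for gold, pred in zip(y_true, y_pred):
        predicted_counts[pred] += 1
        if gold == pred:
            true_positives[pred] += 1

    # Average over every class seen in either the gold or predicted labels.
    labels = set(y_true) | set(y_pred)
    per_class = [
        true_positives[c] / predicted_counts[c] if predicted_counts[c] else 0.0
        for c in labels
    ]
    return sum(per_class) / len(per_class)


# Hypothetical relation labels, used only to exercise the function.
gold = ["born_in", "born_in", "works_for", "NA", "NA", "NA"]
pred = ["born_in", "NA", "works_for", "NA", "NA", "works_for"]
score = macro_precision(gold, pred)
```

Because every class carries equal weight, rare relations influence the score as much as frequent ones, which matters when a noise reduction step removes instances unevenly across relations.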

Original language: English
Pages (from-to): 4523-4529
Number of pages: 7
Journal: Journal of Intelligent and Fuzzy Systems
Volume: 42
Issue number: 5
Publication status: Published - 31 Mar 2022

Bibliographical note

Funding Information:
The present work was supported by CONACyT/Mexico (scholarship 937210 and grant CB-2015-01-257383). Additionally, the authors thank CONACyT for the computer resources provided through the INAOE Supercomputing Laboratory's Deep Learning Platform for Language Technologies. Finally, the authors thank Dr. Miguel A. Alvarez-Carmona from CICESE-UT3 for his comments and suggestions.

Publisher Copyright:
© 2022 - IOS Press. All rights reserved.

Keywords

  • adversarial autoencoders
  • distant supervision
  • noise reduction

ASJC Scopus subject areas

  • Statistics and Probability
  • Engineering (all)
  • Artificial Intelligence
