Evaluation of a new representation for noise reduction in distant supervision

Juan Luis García-Mendoza*, Luis Villaseñor-Pineda, Davide Buscaldi, Lázaro Bustio-Martínez, Felipe Orihuela-Espina

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Distant Supervision is a relation extraction approach that allows a dataset to be labeled automatically. However, this labeling introduces noise in the labels (e.g., when two entities in a sentence are automatically labeled with an invalid relation). Label noise makes the relation extraction task more difficult and is one of its main challenges. Until now, methods that incorporate a prior noise reduction step have not evaluated the performance of that step. This paper evaluates noise reduction using a new representation obtained with autoencoders. In addition, more information was incorporated into the input of the autoencoder proposed in the state of the art, to improve the representation over which the noise is reduced. Three methods were also proposed to select the instances considered as real. As a result, the highest values of the area under the ROC curve were obtained using the improved input combined with state-of-the-art anomaly detection methods. Moreover, the three proposed selection methods significantly improve on the existing method in the literature.
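
The evaluation idea in the abstract (score each instance with an anomaly detector, then measure how well the scores separate noisy from real instances using the area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D toy vectors stand in for the autoencoder representation, the centroid-distance score stands in for an anomaly detection method, and the function names `centroid_distance_scores`, `roc_auc`, and `select_real` are hypothetical.

```python
import math
import random


def centroid_distance_scores(vectors):
    """Anomaly score = Euclidean distance to the mean vector.

    Assumption: an illustrative stand-in for a real anomaly detector.
    """
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return [math.dist(v, mean) for v in vectors]


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 = noisy (positive class), 0 = real. Ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one instance of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_real(instances, scores, threshold):
    """Keep instances whose anomaly score is at or below the threshold."""
    return [x for x, s in zip(instances, scores) if s <= threshold]


# Toy data (assumption): real instances cluster near the origin,
# noisy instances lie far from it.
rng = random.Random(0)
real = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
noisy = [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(10)]
vectors = real + noisy
labels = [0] * len(real) + [1] * len(noisy)

scores = centroid_distance_scores(vectors)
auc = roc_auc(labels, scores)
```

A higher `auc` means the scores rank noisy instances above real ones more consistently; `select_real` then shows one simple, threshold-based way to keep the instances treated as real.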

Original language: English
Title of host publication: Advances in Computational Intelligence
Subtitle of host publication: 21st Mexican International Conference on Artificial Intelligence, MICAI 2022, Proceedings, Part II
Editors: Obdulia Pichardo Lagunas, Bella Martínez Seis, Juan Martínez-Miranda
Publisher: Springer
Pages: 101-113
Number of pages: 13
Edition: 1
ISBN (Electronic): 9783031194962
ISBN (Print): 9783031194955
Publication status: Published - 25 Oct 2022
Event: 21st Mexican International Conference on Artificial Intelligence, MICAI 2022 - Monterrey, Mexico
Duration: 24 Oct 2022 - 29 Oct 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13613 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st Mexican International Conference on Artificial Intelligence, MICAI 2022
Country/Territory: Mexico
City: Monterrey
Period: 24/10/22 - 29/10/22

Bibliographical note

Funding Information:
The present work was supported by CONACyT/México (scholarship 937210 and grant CB-2015-01-257383) and Labex EFL through EFL mobility grants. Additionally, the authors thank CONACYT for the computer resources provided through the INAOE Supercomputing Laboratory’s Deep Learning Platform for Language Technologies.

Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords

  • Adversarial autoencoders
  • Data representation
  • Distant supervision
  • Noise reduction

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
