Effects of negation and uncertainty stratification on text-derived patient profile similarity

Luke T Slater; Andreas Karwath; Robert Hoehndorf; Georgios V Gkoutos

doi:10.3389/fdgth.2021.781227

Cancer and Genomic Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

Semantic similarity is a useful approach for comparing patient phenotypes, and holds potential as an effective method for exploiting text-derived phenotypes for differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how including or excluding negated and uncertain mentions of concepts in text-derived phenotypes affects the similarity of patients, and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and observe a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.

Original language	English
Article number	781227
Number of pages	6
Journal	Frontiers in Digital Health
Volume	3
DOIs	https://doi.org/10.3389/fdgth.2021.781227
Publication status	Published - 6 Dec 2021

Keywords

context disambiguation
differential diagnosis
negation
ontology
phenotype profiles
semantic similarity

Access to Document

10.3389/fdgth.2021.781227. Licence: Creative Commons: Attribution (CC BY)

SlaterL2021Effects: Final published version, 377 KB. Licence: Creative Commons: Attribution (CC BY)

Cite this

@article{d30da09a05614bf29e439361f6df7c76,

title = "Effects of negation and uncertainty stratification on text-derived patient profile similarity",

abstract = "Semantic similarity is a useful approach for comparing patient phenotypes, and holds the potential of an effective method for exploiting text-derived phenotypes for differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how inclusion and disclusion of negated and uncertain mentions of concepts from text-derived phenotypes affects similarity of patients, and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and report a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.",

keywords = "context disambiguation, differential diagnosis, negation, ontology, phenotype profiles, semantic similarity",

author = "Slater, {Luke T} and Andreas Karwath and Robert Hoehndorf and Gkoutos, {Georgios V}",

note = "Copyright {\textcopyright} 2021 Slater, Karwath, Hoehndorf and Gkoutos.",

year = "2021",

month = dec,

day = "6",

doi = "10.3389/fdgth.2021.781227",

language = "English",

volume = "3",

journal = "Frontiers in Digital Health",

issn = "2673-253X",

publisher = "Frontiers Media",

}

TY - JOUR

T1 - Effects of negation and uncertainty stratification on text-derived patient profile similarity

AU - Slater, Luke T

AU - Karwath, Andreas

AU - Hoehndorf, Robert

AU - Gkoutos, Georgios V

PY - 2021/12/6

Y1 - 2021/12/6

AB - Semantic similarity is a useful approach for comparing patient phenotypes, and holds the potential of an effective method for exploiting text-derived phenotypes for differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how inclusion and disclusion of negated and uncertain mentions of concepts from text-derived phenotypes affects similarity of patients, and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and report a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.

KW - context disambiguation

KW - differential diagnosis

KW - negation

KW - ontology

KW - phenotype profiles

KW - semantic similarity

U2 - 10.3389/fdgth.2021.781227

DO - 10.3389/fdgth.2021.781227

M3 - Article

C2 - 34939069

SN - 2673-253X

VL - 3

JO - Frontiers in Digital Health

JF - Frontiers in Digital Health

M1 - 781227

ER -
