Evaluating semantic similarity methods for comparison of text-derived phenotype profiles

Luke T Slater; Sophie Russell; Silver Makepeace; Alexander Carberry; Andreas Karwath; John A Williams; Hilary Fanning; Simon Ball; Robert Hoehndorf; Georgios V Gkoutos

doi:10.1186/s12911-022-01770-4

Research output: Contribution to journal › Article › peer-review

Abstract

BACKGROUND: Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it has the capacity to enable and enhance 'patient-like me' analyses, automated coding, differential diagnosis, and outcome prediction. While a large body of work exists exploring the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work exploring comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or better methods in the area.

METHODS: We develop a platform for reproducible benchmarking and comparison of experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnosis from uncurated phenotype profiles derived from all text narrative associated with admissions in the Medical Information Mart for Intensive Care (MIMIC-III).

RESULTS: 300 semantic similarity configurations were evaluated, as well as one embedding-based approach. On average, measures that did not make use of an external information content measure performed slightly better; however, the best-performing configurations, when measured by area under the receiver operating characteristic curve and Top Ten Accuracy, used term-specificity and annotation-frequency measures.

CONCLUSION: We identified and interpreted the performance of a large number of semantic similarity configurations for the task of classifying diagnosis from text-derived phenotype profiles in one setting. We also provided a basis for further research on other settings and related tasks in the area.
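To make the kind of configuration described in the abstract more concrete, the following is a minimal sketch of one possible setup: an annotation-frequency information content measure, Resnik pairwise similarity, best-match-average aggregation, and a top-k hit check for shared primary diagnosis. It is not the authors' platform. The toy ontology, the admission identifiers, the diagnosis labels, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy ontology (child -> parents); not taken from the paper.
PARENTS = {
    "root": set(),
    "cardiac": {"root"},
    "respiratory": {"root"},
    "arrhythmia": {"cardiac"},
    "tachycardia": {"arrhythmia"},
    "dyspnea": {"respiratory"},
}

# Hypothetical phenotype profiles and primary diagnoses per admission.
PROFILES = {
    "adm1": {"tachycardia"},
    "adm2": {"arrhythmia"},
    "adm3": {"dyspnea"},
}
DIAGNOSIS = {"adm1": "dx_a", "adm2": "dx_a", "adm3": "dx_b"}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def annotation_ic(profiles):
    """Annotation-frequency IC: -log of the fraction of profiles that are
    annotated with the term or with any of its descendants."""
    counts = Counter()
    for terms in profiles.values():
        closure = set()
        for term in terms:
            closure |= ancestors(term)
        counts.update(closure)
    total = len(profiles)
    return {term: -math.log(n / total) for term, n in counts.items()}


def resnik(a, b, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    return max(ic.get(term, 0.0) for term in common)


def best_match_average(p, q, ic):
    """Average, in both directions, of each term's best pairwise match."""
    def one_way(xs, ys):
        return sum(max(resnik(x, y, ic) for y in ys) for x in xs) / len(xs)
    return (one_way(p, q) + one_way(q, p)) / 2


def rank_others(query, profiles, ic):
    """Rank all other admissions by similarity to the query, best first."""
    scores = {
        other: best_match_average(profiles[query], terms, ic)
        for other, terms in profiles.items()
        if other != query
    }
    return sorted(scores, key=scores.get, reverse=True)


def top_k_hit(query, ranking, diagnosis, k=10):
    """True if any of the top-k ranked admissions shares the query's diagnosis."""
    return any(diagnosis[o] == diagnosis[query] for o in ranking[:k])


ic = annotation_ic(PROFILES)
ranking = rank_others("adm1", PROFILES, ic)
print(ranking, top_k_hit("adm1", ranking, DIAGNOSIS, k=1))
```

Swapping `annotation_ic` for a structure-based specificity measure, or `best_match_average` for another groupwise aggregation, would give a different configuration in the sense used above; averaging `top_k_hit` over all query admissions gives a top-k accuracy.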

Original language	English
Article number	33
Number of pages	12
Journal	BMC Medical Informatics and Decision Making
Volume	22
Issue number	1
DOIs	https://doi.org/10.1186/s12911-022-01770-4
Publication status	Published - 5 Feb 2022

Keywords

MIMIC-III
differential diagnosis
ontology
semantic similarity
semantic web

Access to Document

10.1186/s12911-022-01770-4 (final published version, 1.06 MB). Licence: Creative Commons: Attribution (CC BY)

Cite this

@article{6b64a2f714094b7abb9373ccb6d527e0,

title = "Evaluating semantic similarity methods for comparison of text-derived phenotype profiles",

abstract = "BACKGROUND: Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it has the capacity to enable and enhance 'patient-like me' analyses, automated coding, differential diagnosis, and outcome prediction. While a large body of work exists exploring the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work exploring comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or better methods in the area. METHODS: We develop a platform for reproducible benchmarking and comparison of experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnosis from uncurated phenotype profiles derived from all text narrative associated with admissions in the Medical Information Mart for Intensive Care (MIMIC-III). RESULTS: 300 semantic similarity configurations were evaluated, as well as one embedding-based approach. On average, measures that did not make use of an external information content measure performed slightly better; however, the best-performing configurations, when measured by area under the receiver operating characteristic curve and Top Ten Accuracy, used term-specificity and annotation-frequency measures. CONCLUSION: We identified and interpreted the performance of a large number of semantic similarity configurations for the task of classifying diagnosis from text-derived phenotype profiles in one setting. We also provided a basis for further research on other settings and related tasks in the area.",

keywords = "MIMIC-III, differential diagnosis, ontology, semantic similarity, semantic web",

author = "Slater, {Luke T} and Sophie Russell and Silver Makepeace and Alexander Carberry and Andreas Karwath and Williams, {John A} and Hilary Fanning and Simon Ball and Robert Hoehndorf and Gkoutos, {Georgios V}",

note = "{\textcopyright} 2022. The Author(s).",

year = "2022",

month = feb,

day = "5",

doi = "10.1186/s12911-022-01770-4",

language = "English",

volume = "22",

journal = "BMC Medical Informatics and Decision Making",

issn = "1472-6947",

publisher = "Springer",

number = "1",

}

TY - JOUR

T1 - Evaluating semantic similarity methods for comparison of text-derived phenotype profiles

AU - Slater, Luke T

AU - Russell, Sophie

AU - Makepeace, Silver

AU - Carberry, Alexander

AU - Karwath, Andreas

AU - Williams, John A

AU - Fanning, Hilary

AU - Ball, Simon

AU - Hoehndorf, Robert

AU - Gkoutos, Georgios V

PY - 2022/2/5

Y1 - 2022/2/5

N2 - BACKGROUND: Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it has the capacity to enable and enhance 'patient-like me' analyses, automated coding, differential diagnosis, and outcome prediction. While a large body of work exists exploring the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work exploring comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or better methods in the area. METHODS: We develop a platform for reproducible benchmarking and comparison of experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnosis from uncurated phenotype profiles derived from all text narrative associated with admissions in the Medical Information Mart for Intensive Care (MIMIC-III). RESULTS: 300 semantic similarity configurations were evaluated, as well as one embedding-based approach. On average, measures that did not make use of an external information content measure performed slightly better; however, the best-performing configurations, when measured by area under the receiver operating characteristic curve and Top Ten Accuracy, used term-specificity and annotation-frequency measures. CONCLUSION: We identified and interpreted the performance of a large number of semantic similarity configurations for the task of classifying diagnosis from text-derived phenotype profiles in one setting. We also provided a basis for further research on other settings and related tasks in the area.

KW - MIMIC-III

KW - differential diagnosis

KW - ontology

KW - semantic similarity

KW - semantic web

UR - http://www.scopus.com/inward/record.url?scp=85124177059&partnerID=8YFLogxK

U2 - 10.1186/s12911-022-01770-4

DO - 10.1186/s12911-022-01770-4

M3 - Article

C2 - 35123470

SN - 1472-6947

VL - 22

JO - BMC Medical Informatics and Decision Making

JF - BMC Medical Informatics and Decision Making

IS - 1

M1 - 33

ER -
