A sentence classification framework to identify geometric errors in radiation therapy from relevant literature

Tanmay Basu, Simon Goldsworthy, Georgios V. Gkoutos

Research output: Contribution to journal › Article › peer-review

Abstract

The objective of systematic reviews is to address a research question by summarizing relevant studies following a detailed, comprehensive, and transparent plan and search protocol to reduce bias. Systematic reviews are very useful in the biomedical and healthcare domain; however, the data extraction phase of the systematic review process necessitates substantive expertise and is labour-intensive and time-consuming. The aim of this work is to partially automate the process of building systematic radiotherapy treatment literature reviews by summarizing the required data elements of geometric errors of radiotherapy from relevant literature using machine learning and natural language processing (NLP) approaches. A framework is developed in this study that initially builds a training corpus by extracting sentences containing different types of geometric errors of radiotherapy from relevant publications. The publications are retrieved from PubMed following a given set of rules defined by a domain expert. Subsequently, the method develops a training corpus by extracting relevant sentences using a sentence similarity measure. A support vector machine (SVM) classifier is then trained on this training corpus to extract the sentences from new publications which contain relevant geometric errors. To demonstrate the proposed approach, we have used 60 publications containing geometric errors in radiotherapy to automatically extract the sentences stating the mean and standard deviation of different types of errors between planned and executed radiotherapy. The experimental results show that the recall and precision of the proposed framework are, respectively, 97% and 72%. The results clearly show that the framework is able to extract almost all sentences containing required data of geometric errors.
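The corpus-building and evaluation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the sentence similarity measure, so cosine similarity over raw token counts is assumed here, and the seed query, example sentences, and the 0.3 threshold are all hypothetical. The SVM training step is omitted.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity of two sentences over raw token counts (assumed measure)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(sentences, seeds, threshold=0.3):
    """Keep sentences whose best similarity to any seed reaches the threshold."""
    return [
        s for s in sentences
        if max((cosine_similarity(s, q) for q in seeds), default=0.0) >= threshold
    ]


def precision_recall(predicted, relevant):
    """Precision and recall of the predicted set against the relevant set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: one seed query and three candidate sentences.
seeds = ["mean and standard deviation of setup error"]
sentences = [
    "The mean setup error was 1.2 mm with a standard deviation of 0.8 mm.",
    "Patients were recruited between 2015 and 2017.",
    "Systematic and random setup error were reported for each axis.",
]
# Hypothetical gold standard: only the first sentence states a mean and SD.
relevant = [sentences[0]]

selected = select_sentences(sentences, seeds)
precision, recall = precision_recall(selected, relevant)
print(selected, precision, recall)
```

In this toy case the filter keeps every relevant sentence plus one that only shares vocabulary with the seed, so recall is higher than precision. That is the same trade-off the reported figures reflect, where missing a relevant sentence is treated as more costly than extracting an extra one.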

Original language: English
Article number: 139
Journal: Information
Volume: 12
Issue number: 4
Early online date: 24 Mar 2021
Publication status: Published - Apr 2021

Bibliographical note

Funding Information:
This work was directly supported by the MRC Health Data Research UK (HDRUK/CFC/01), an initiative funded by UK Research and Innovation, Department of Health and Social Care (England) and the devolved administrations, and leading medical research charities. The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research, the Medical Research Council or the Department of Health. Georgios V. Gkoutos also acknowledges support from the NIHR Birmingham ECMC, NIHR Birmingham SRMRC, Nanocommons H2020-EU (731032) and the NIHR Birmingham Biomedical Research Centre.

Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.

Keywords

  • Geometric error
  • Health informatics
  • Information extraction
  • Machine learning
  • NLP
  • Radiotherapy
  • Text mining

ASJC Scopus subject areas

  • Information Systems
