Machine learning utilising spectral derivative data improves cellular health classification through hyperspectral infra-red spectroscopy

Research output: Contribution to journal › Article › peer-review


  • Ben O. L. Mellors
  • Abigail M. Spear
  • Christopher R. Howle
  • Kelly Curtis
  • Sarah Macildowie

Colleges, School and Institutes

External organisations

  • Defence Science and Technology Laboratory, Porton Down, Salisbury, United Kingdom


The objective differentiation of facets of cellular metabolism is important for several clinical applications, including accurate definition of tumour boundaries and targeted wound debridement. To this end, spectral biomarkers that differentiate live from necrotic/apoptotic cells have been defined using in vitro methods. Delineating different cellular states with spectroscopic methods is difficult because of the complex nature of these biological processes, so sophisticated, objective classification methods will be important for such differentiation. In this study, spectral data were collected from healthy and traumatised cell samples by hyperspectral imaging between 2500 and 3500 nm using a portable prototype device. Machine learning algorithms, in the form of clustering, were applied to a variety of pre-processed data types: ‘raw’ unprocessed, smoothed resampling, background subtracted and spectral derivative. The resulting clusters were used as a diagnostic tool for assessing cellular health, and the analysis methods were compared using both sensitivity and specificity. Clustering of the raw data showed differences for one of the three trauma types applied, but could not accurately cluster all the traumatised samples because of signal contamination from the chemical insult. The background subtracted and smoothed data sets reduced the accuracy further, apparently because they removed key spectral features that indicate cellular health. However, the spectral derivative data types significantly improved the accuracy of clustering compared with the other data types, with both sensitivity and specificity for the background subtracted data set being >94%, highlighting their utility in accounting for unknown signal contamination while maintaining important cellular spectral features.
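The workflow summarised in the abstract (take a spectral derivative, cluster the samples, then score the clusters by sensitivity and specificity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the clustering algorithm or the derivative method, so the sketch uses a two-cluster k-means and a finite-difference derivative. The spectra are synthetic, the function names are hypothetical, and the first sample is assumed to be a known healthy reference for naming the clusters. It is not the authors' implementation and does not reproduce their results.

```python
import numpy as np

def spectral_derivative(spectra, wavelengths):
    """Finite-difference first derivative of each spectrum (rows) w.r.t. wavelength."""
    return np.gradient(spectra, wavelengths, axis=1)

def two_means(X, n_iter=50):
    """Minimal two-cluster k-means with a deterministic farthest-point start."""
    first = X[np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1))]
    second = X[np.argmax(np.linalg.norm(X - first, axis=1))]
    centres = np.stack([first, second])
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.stack([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centres[k]
            for k in range(2)
        ])
        if np.allclose(updated, centres):
            break
        centres = updated
    return labels

def sensitivity_specificity(truth, predicted):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = np.sum(truth & predicted)
    fn = np.sum(truth & ~predicted)
    tn = np.sum(~truth & ~predicted)
    fp = np.sum(~truth & predicted)
    return tp / (tp + fn), tn / (tn + fp)

# Synthetic stand-in data (assumption): 20 healthy and 20 traumatised spectra
# over 2500-3500 nm, each with a random constant offset that mimics unknown
# signal contamination.
rng = np.random.default_rng(0)
wavelengths = np.linspace(2500.0, 3500.0, 101)
healthy_shape = np.exp(-((wavelengths - 3000.0) / 100.0) ** 2)
trauma_shape = healthy_shape + 0.5 * np.exp(-((wavelengths - 2800.0) / 60.0) ** 2)
spectra = np.vstack([np.tile(healthy_shape, (20, 1)), np.tile(trauma_shape, (20, 1))])
spectra = spectra + rng.uniform(-5.0, 5.0, size=(40, 1))   # per-sample offset
spectra = spectra + rng.normal(0.0, 0.001, size=spectra.shape)  # measurement noise
is_traumatised = np.array([False] * 20 + [True] * 20)

# Cluster the derivative spectra; sample 0 is treated as a known healthy
# reference (assumption), so its cluster is labelled "healthy".
labels = two_means(spectral_derivative(spectra, wavelengths))
predicted_traumatised = labels != labels[0]
sens, spec = sensitivity_specificity(is_traumatised, predicted_traumatised)
print(f"derivative data: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy setting the derivative removes the per-sample constant offset exactly, which is one simple way a derivative can suppress an unknown additive contribution while keeping the shape of the spectral features. Real contamination from a chemical insult is unlikely to be a pure offset, so the sketch only shows the principle.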


Original language: English
Article number: e0238647
Number of pages: 21
Journal: PLOS One
Issue number: 9
Publication status: Published - 15 Sep 2020


  • Cluster Analysis, Contrast Media/chemistry, Fibroblasts/cytology, Humans, Image Processing, Computer-Assisted, Machine Learning, Necrosis, Principal Component Analysis, Spectrophotometry, Infrared