Deep convolutional neural networks outperform vanilla machine learning when predicting language outcomes after stroke

  • Thomas M.H. Hope*
  • Howard Bowman
  • Alex P. Leff
  • Cathy J. Price

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

BACKGROUND: Current medicine cannot confidently predict patients' language skills after stroke. In recent years, researchers have sought to bridge this gap with machine learning. These models appear to benefit from access to features describing where and how much brain damage these patients have suffered. Given the very high dimensionality of structural brain imaging data, those brain lesion features are typically post-processed from the images themselves into tabular features. With the introduction of deep Convolutional Neural Networks (CNNs), which appear to be much more robust to high-dimensional data, it is natural to hope that much of this image post-processing might be unnecessary. But prior attempts to demonstrate this (in the area of post-stroke prognostics) have so far yielded only equivocal results, perhaps because the datasets that those studies could deploy were too small to properly constrain CNNs, which are famously 'data-hungry'.

METHODS: The study draws on a much larger dataset than has been employed in previous work of this kind, comprising patients whose language outcomes were assessed once during the chronic phase post-stroke, on or around the same days as they underwent high-resolution MRI brain scans. Following the model of our own and others' past work, we use state-of-the-art 'vanilla' machine learning models (boosted ensembles) to predict a variety of language and cognitive outcome scores. These models employ both demographic variables and features derived from the brain imaging data, which represent where brain damage has occurred. These are our baseline models. Next, we use deep CNNs to predict the same language scores for the same patients, drawing on both the demographic variables and post-processed brain lesion images: i.e., multi-input models with one input for tabular features and another for 3-dimensional images. We compare the models using 5 × 2-fold cross-validation, with consistent folds.
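
As an illustration of the comparison scheme named above, the following is a minimal sketch of 5 × 2-fold cross-validation with consistent folds: the data are split in half five times, each half serves once as training set and once as test set, and both models are scored on exactly the same ten splits. It is not the authors' code. The function names, the mean-absolute-error metric, the synthetic data, and the two toy predictors (standing in for the boosted-ensemble baseline and the CNN) are all assumptions made for the example.

```python
import random
from statistics import mean


def five_by_two_splits(n_samples, n_repeats=5, seed=0):
    """Return 2 * n_repeats (train, test) index pairs; the seed fixes the folds."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        half = n_samples // 2
        first, second = order[:half], order[half:]
        splits.append((first, second))  # train on first half, test on second
        splits.append((second, first))  # then swap the roles
    return splits


def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def evaluate(fit_predict, X, y, splits):
    """Score one model on every split; returns one error value per fold."""
    scores = []
    for train, test in splits:
        preds = fit_predict([X[i] for i in train],
                            [y[i] for i in train],
                            [X[i] for i in test])
        scores.append(mean_absolute_error([y[i] for i in test], preds))
    return scores


# Hypothetical stand-in models (assumptions, not the models used in the study).
def predict_mean(X_train, y_train, X_test):
    m = mean(y_train)
    return [m for _ in X_test]


def predict_linear(X_train, y_train, X_test):
    mx, my = mean(X_train), mean(y_train)
    var = sum((x - mx) ** 2 for x in X_train)
    slope = sum((x - mx) * (t - my) for x, t in zip(X_train, y_train)) / var
    return [my + slope * (x - mx) for x in X_test]


# Synthetic data, purely for demonstration.
X = [float(i) for i in range(40)]
y = [2.0 * x + 1.0 for x in X]

splits = five_by_two_splits(len(X))            # built once ...
scores_a = evaluate(predict_mean, X, y, splits)    # ... and reused for
scores_b = evaluate(predict_linear, X, y, splits)  # both models
fold_differences = [a - b for a, b in zip(scores_a, scores_b)]
```

Because the splits are generated once and passed to both evaluations, the per-fold score differences are paired, which is what makes a fold-by-fold comparison of the two models meaningful.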

RESULTS: The CNN models consistently outperform the vanilla machine learning models in this domain.

CONCLUSIONS: Deep CNNs offer state-of-the-art performance when predicting language outcomes after stroke, outperforming vanilla machine learning and obviating the need to post-process lesion images into lesion features.

Original language: English
Article number: 103880
Number of pages: 6
Journal: NeuroImage: Clinical
Volume: 48
Publication status: Published - 29 Sept 2025

Bibliographical note

Copyright © 2025 The Author(s). Published by Elsevier Inc.

Keywords

  • Stroke
  • Language
  • Cognition
  • Machine learning
  • Lesions
  • Deep learning
