A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability

Research output: Contribution to journal, Article, peer-reviewed

Authors

  • Saad M Khan
  • Xiaoxuan Liu
  • Siddharth Nath
  • Edward Korot
  • Livia Faes
  • Siegfried K Wagner
  • Pearse A Keane
  • Neil J Sebire
  • Matthew J Burton

Abstract

Health data that are publicly available are valuable resources for digital health research. Several public datasets containing ophthalmological imaging have been frequently used in machine learning research; however, the total number of datasets containing ophthalmological health information and their respective content is unclear. This Review aimed to identify all publicly available ophthalmological imaging datasets, detail their accessibility, describe which diseases and populations are represented, and report on the completeness of the associated metadata. With the use of MEDLINE, Google's search engine, and Google Dataset Search, we identified 94 open access datasets containing 507 724 images and 125 videos from 122 364 patients. Most datasets originated from Asia, North America, and Europe. Disease populations were unevenly represented, with glaucoma, diabetic retinopathy, and age-related macular degeneration disproportionately overrepresented in comparison with other eye diseases. The reporting of basic demographic characteristics such as age, sex, and ethnicity was poor, even at the aggregate level. This Review provides greater visibility for ophthalmological datasets that are publicly available as powerful resources for research. Our paper also exposes an increasing divide in the representation of different population and disease groups in health data repositories. The improved reporting of metadata would enable researchers to access the most appropriate datasets for their needs and maximise the potential of such resources.

Details

Original language: English
Pages (from-to): e51-e66
Journal: The Lancet Digital Health
Volume: 3
Issue number: 1
Early online date: 1 Oct 2020
Publication status: Published - 1 Jan 2021