Dual Representation Learning From Fetal Ultrasound Video and Sonographer Audio

Mourad Gridach*, Mohammad Alsharid, Jianbo Jiao, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper tackles the challenging problem of self-supervised representation learning from real-world data in two modalities: fetal ultrasound (US) video and the corresponding speech recorded while a sonographer performs a pregnancy scan. We propose to transfer knowledge between the two modalities, even though the sonographer's speech and the US video may not be semantically correlated. We design a network architecture capable of learning useful representations, such as anatomical features and structures, while recognising the correlation between a US video scan and the sonographer's speech. We introduce dual representation learning from US video and audio, which combines two concepts in a latent feature space: Multi-Modal Contrastive Learning and Multi-Modal Similarity Learning. Experiments show that the proposed architecture learns powerful representations that transfer well to two downstream tasks. Furthermore, we pretrain on two datasets that differ in size and in the length of the video clips (and of the accompanying sonographer speech), showing that the quality of the sonographer's speech plays an important role in final performance.
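The abstract names the two training concepts but does not give their formulas. As a rough illustration only, the sketch below pairs a generic symmetric InfoNCE-style contrastive term with a cosine-similarity term over paired video and audio embeddings. The function names, the temperature of 0.1, the equal weighting of the two terms, and the toy embeddings are all assumptions made for this example; none of it is taken from the paper, and the authors' actual losses may differ.

```python
import math

def normalize(vec):
    # Scale a vector to unit length (zero vectors are left unchanged).
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine_matrix(video, audio):
    # Entry [i][j] is the cosine similarity of video clip i and audio clip j.
    v = [normalize(row) for row in video]
    a = [normalize(row) for row in audio]
    return [[sum(x * y for x, y in zip(vi, aj)) for aj in a] for vi in v]

def cross_entropy_diagonal(logits):
    # Mean of -log softmax(row_i)[i]: the matching pair is the "correct class".
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def contrastive_loss(video, audio, temperature=0.1):
    # Symmetric InfoNCE-style loss (assumed form): video-to-audio plus audio-to-video.
    sim = cosine_matrix(video, audio)
    logits = [[s / temperature for s in row] for row in sim]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

def similarity_loss(video, audio):
    # Assumed similarity term: mean (1 - cosine) over matching pairs only.
    sim = cosine_matrix(video, audio)
    return sum(1.0 - sim[i][i] for i in range(len(sim))) / len(sim)

def dual_loss(video, audio, weight=1.0, temperature=0.1):
    # Hypothetical combination of the two terms; the weighting is an assumption.
    return contrastive_loss(video, audio, temperature) + weight * similarity_loss(video, audio)

# Toy embeddings (made up): three video clips and their paired audio clips.
video = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
audio_matched = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
audio_shuffled = audio_matched[1:] + audio_matched[:1]  # pairs deliberately misaligned
```

With these toy inputs, `dual_loss(video, audio_matched)` comes out lower than `dual_loss(video, audio_shuffled)`, which is the behaviour a pretraining objective of this kind relies on: embeddings of a clip and its own audio are pulled together, while mismatched pairs are pushed apart.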

Original language: English
Title of host publication: 2024 IEEE International Symposium on Biomedical Imaging (ISBI)
Publisher: IEEE Computer Society Press
ISBN (Electronic): 9798350313338
ISBN (Print): 9798350313345
Publication status: Published - 22 Aug 2024
Event: 21st IEEE International Symposium on Biomedical Imaging, ISBI 2024 - Athens, Greece
Duration: 27 May 2024 to 30 May 2024

Publication series

Name: Proceedings - International Symposium on Biomedical Imaging
ISSN (Print): 1945-7928
ISSN (Electronic): 1945-8452

Conference

Conference: 21st IEEE International Symposium on Biomedical Imaging, ISBI 2024
Country/Territory: Greece
City: Athens
Period: 27/05/24 to 30/05/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • Multi-Modal
  • Self-Supervised
  • Ultrasound

ASJC Scopus subject areas

  • Biomedical Engineering
  • Radiology, Nuclear Medicine and Imaging