Abstract
This paper tackles the challenging problem of self-supervised representation learning from real-world data in two modalities: fetal ultrasound (US) video and the corresponding speech recorded while a sonographer performs a pregnancy scan. We propose to transfer knowledge between the two modalities, even though the sonographer's speech and the US video may not be semantically correlated. We design a network architecture capable of learning useful representations, such as those of anatomical features and structures, while recognising the correlation between a US video scan and the sonographer's speech. We introduce dual representation learning from US video and audio, which consists of two concepts, Multi-Modal Contrastive Learning and Multi-Modal Similarity Learning, both operating in a latent feature space. Experiments show that the proposed architecture learns powerful representations that transfer well to two downstream tasks. Furthermore, we experiment with two pretraining datasets that differ in size and in the length of the video clips (and of the accompanying sonographer speech) to show that the quality of the sonographer's speech plays an important role in the final performance.
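
As a rough illustration of what contrastive learning between two modalities in a latent space can look like, the sketch below computes a generic symmetric InfoNCE-style loss over paired video and audio embeddings. It is not taken from the paper: the function names, the cosine similarity measure, and the temperature value are all assumptions chosen for the example, and the paper's actual losses may differ.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity_matrix(video_emb, audio_emb):
    """Entry [i][j] is the cosine similarity of video clip i and audio clip j."""
    v = [l2_normalize(x) for x in video_emb]
    a = [l2_normalize(x) for x in audio_emb]
    return [[sum(p * q for p, q in zip(vi, aj)) for aj in a] for vi in v]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_info_nce(video_emb, audio_emb, temperature=0.1):
    """Hypothetical symmetric contrastive loss (assumption, not the paper's loss).

    Matching video/audio pairs share an index; every other pairing in the
    batch is treated as a negative. The loss averages the video-to-audio
    and audio-to-video directions.
    """
    sim = cosine_similarity_matrix(video_emb, audio_emb)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # audio-to-video direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

if __name__ == "__main__":
    # Toy embeddings: pair i in `video` matches pair i in `audio`.
    video = [[1.0, 0.0], [0.0, 1.0]]
    audio = [[2.0, 0.0], [0.0, 3.0]]
    print(symmetric_info_nce(video, audio))
```

With well-aligned pairs the loss approaches zero, while swapping the audio clips so that each video is paired with the wrong speech segment makes it large. A loss of this kind pulls matching clips together in the shared latent space and pushes mismatched ones apart.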
Original language | English |
---|---|
Title of host publication | 2024 IEEE International Symposium on Biomedical Imaging (ISBI) |
Publisher | IEEE Computer Society Press |
ISBN (Electronic) | 9798350313338 |
ISBN (Print) | 9798350313345 |
Publication status | Published - 22 Aug 2024 |
Event | 21st IEEE International Symposium on Biomedical Imaging, ISBI 2024 - Athens, Greece; Duration: 27 May 2024 → 30 May 2024 |
Publication series
Name | Proceedings - International Symposium on Biomedical Imaging |
---|---|
ISSN (Print) | 1945-7928 |
ISSN (Electronic) | 1945-8452 |
Conference
Conference | 21st IEEE International Symposium on Biomedical Imaging, ISBI 2024 |
---|---|
Country/Territory | Greece |
City | Athens |
Period | 27/05/24 → 30/05/24 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Keywords
- Multi-Modal
- Self-Supervised
- Ultrasound
ASJC Scopus subject areas
- Biomedical Engineering
- Radiology, Nuclear Medicine and Imaging
Fingerprint
Research topics of 'Dual Representation Learning From Fetal Ultrasound Video and Sonographer Audio'.
Projects
- 1 Finished
CLRM3D: Continual Large-scale Representation Learning from Multi-Modal Medical Data
Jiao, J. (Principal Investigator)
18/04/23 → 17/04/25
Project: Research Councils