FreMAE: Fourier Transform Meets Masked Autoencoders for Medical Image Segmentation

Wenxuan Wang; Jing Wang; Chen Chen; Jianbo Jiao; Lichao Sun; Yuanxiu Cai; Shanshan Song; Jiangyun Li

doi:10.48550/arXiv.2304.10864

Computer Science

Research output: Working paper/Preprint › Preprint

Abstract

The research community has witnessed the powerful potential of self-supervised Masked Image Modeling (MIM), which enables models to learn visual representations from unlabeled data. In this paper, to incorporate both the crucial global structural information and local details for dense prediction tasks, we shift the perspective to the frequency domain and present a new MIM-based framework named FreMAE for self-supervised pre-training for medical image segmentation. Based on the observations that the detailed structural information mainly lies in the high-frequency components and the high-level semantics are abundant in the low-frequency counterparts, we further incorporate multi-stage supervision to guide the representation learning during the pre-training phase. Extensive experiments on three benchmark datasets show the advantage of our proposed FreMAE over previous state-of-the-art MIM methods. Compared with various baselines trained from scratch, FreMAE consistently brings considerable improvements to model performance. To the best of our knowledge, this is the first attempt towards MIM with Fourier Transform in medical image segmentation.
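The observation about high- and low-frequency components can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `split_frequency_bands`, the circular mask, and the cutoff `radius` are all assumptions made for illustration. It only shows how a 2D Fourier transform separates an image into a smooth low-frequency part and a detail-carrying high-frequency part.

```python
import numpy as np


def split_frequency_bands(image, radius):
    """Split a 2D image into low- and high-frequency parts.

    Hypothetical helper for illustration only; `radius` is an assumed
    cutoff (in frequency bins), not a value taken from the paper.
    """
    h, w = image.shape
    # Shift the zero-frequency component to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Distance of every frequency bin from the spectrum centre.
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius
    # Invert each band separately; the two masks are complementary.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


rng = np.random.default_rng(0)
image = rng.random((64, 64))  # stand-in for a grayscale medical image slice
low, high = split_frequency_bands(image, radius=8)
```

Because the two masks partition the spectrum, `low + high` reconstructs the input up to floating-point error. The low band holds the coarse intensity layout, while the high band holds edges and fine texture.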

Original language	English
Publisher	arXiv
DOIs	https://doi.org/10.48550/arXiv.2304.10864
Publication status	Published - 21 Apr 2023

Access to Document

10.48550/arXiv.2304.10864 (Licence: Other)

2304.10864v1 (Other version, 827 KB; Licence: Other)

Cite this

@techreport{348b4b67c64b48c78fba821ded5f2f4e,
title = "FreMAE: Fourier Transform Meets Masked Autoencoders for Medical Image Segmentation",
abstract = "The research community has witnessed the powerful potential of self-supervised Masked Image Modeling (MIM), which enables models to learn visual representations from unlabeled data. In this paper, to incorporate both the crucial global structural information and local details for dense prediction tasks, we shift the perspective to the frequency domain and present a new MIM-based framework named FreMAE for self-supervised pre-training for medical image segmentation. Based on the observations that the detailed structural information mainly lies in the high-frequency components and the high-level semantics are abundant in the low-frequency counterparts, we further incorporate multi-stage supervision to guide the representation learning during the pre-training phase. Extensive experiments on three benchmark datasets show the advantage of our proposed FreMAE over previous state-of-the-art MIM methods. Compared with various baselines trained from scratch, FreMAE consistently brings considerable improvements to model performance. To the best of our knowledge, this is the first attempt towards MIM with Fourier Transform in medical image segmentation.",
author = "Wenxuan Wang and Jing Wang and Chen Chen and Jianbo Jiao and Lichao Sun and Yuanxiu Cai and Shanshan Song and Jiangyun Li",
year = "2023",
month = apr,
day = "21",
doi = "10.48550/arXiv.2304.10864",
language = "English",
publisher = "arXiv",
type = "WorkingPaper",
institution = "arXiv",
}

TY - UNPB
T1 - FreMAE
T2 - Fourier Transform Meets Masked Autoencoders for Medical Image Segmentation
AU - Wang, Wenxuan
AU - Wang, Jing
AU - Chen, Chen
AU - Jiao, Jianbo
AU - Sun, Lichao
AU - Cai, Yuanxiu
AU - Song, Shanshan
AU - Li, Jiangyun
PY - 2023/4/21
Y1 - 2023/4/21
N2 - The research community has witnessed the powerful potential of self-supervised Masked Image Modeling (MIM), which enables models to learn visual representations from unlabeled data. In this paper, to incorporate both the crucial global structural information and local details for dense prediction tasks, we shift the perspective to the frequency domain and present a new MIM-based framework named FreMAE for self-supervised pre-training for medical image segmentation. Based on the observations that the detailed structural information mainly lies in the high-frequency components and the high-level semantics are abundant in the low-frequency counterparts, we further incorporate multi-stage supervision to guide the representation learning during the pre-training phase. Extensive experiments on three benchmark datasets show the advantage of our proposed FreMAE over previous state-of-the-art MIM methods. Compared with various baselines trained from scratch, FreMAE consistently brings considerable improvements to model performance. To the best of our knowledge, this is the first attempt towards MIM with Fourier Transform in medical image segmentation.
U2 - 10.48550/arXiv.2304.10864
DO - 10.48550/arXiv.2304.10864
M3 - Preprint
BT - FreMAE
PB - arXiv
ER -
