Disentangled Pre-training for Image Matting

Yanda Li; Zilong Huang; Gang Yu; Ling Chen; Yunchao Wei; Jianbo Jiao

doi:10.48550/arXiv.2304.00784

Computer Science

Research output: Working paper/Preprint › Preprint

Abstract

In recent literature, image matting relies on high-quality pixel-level human annotations to train deep models. Such annotation is costly and hard to scale, which significantly holds back progress in the field. In this work, we make the first attempt to address this problem by proposing a self-supervised pre-training approach that can leverage an effectively unlimited amount of data to boost matting performance. The pre-training task is designed to resemble image matting: a random trimap and alpha matte are generated to define an image disentanglement objective. The pre-trained model is then used to initialise the downstream matting task for fine-tuning. Extensive experimental evaluations show that the proposed approach outperforms both state-of-the-art matting methods and alternative self-supervised initialisation approaches by a large margin. We also show the robustness of the proposed approach across different backbone architectures. The code and models will be made publicly available.
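
The abstract describes the pre-training data only at a high level, so the following is a minimal sketch of how such a sample might be built, not the authors' implementation. It assumes grayscale images stored as nested lists of floats, a soft-edged rectangle as the random matte, and the standard compositing equation I = alpha * F + (1 - alpha) * B. The function names `random_alpha`, `trimap_from_alpha` and `composite` are hypothetical.

```python
import random


def random_alpha(height, width, rng):
    """Hypothetical random matte: a rectangle with fractional-opacity edges."""
    top = rng.randrange(height // 2)
    left = rng.randrange(width // 2)
    bottom, right = top + height // 2, left + width // 2
    alpha = [[0.0] * width for _ in range(height)]
    for y in range(top, bottom):
        for x in range(left, right):
            # Pixels on the rectangle border are semi-transparent.
            on_edge = y in (top, bottom - 1) or x in (left, right - 1)
            alpha[y][x] = rng.uniform(0.1, 0.9) if on_edge else 1.0
    return alpha


def trimap_from_alpha(alpha):
    """0.0 = background, 1.0 = foreground, 0.5 = unknown (fractional alpha)."""
    return [[a if a in (0.0, 1.0) else 0.5 for a in row] for row in alpha]


def composite(fg, bg, alpha):
    """Standard compositing equation: I = alpha * F + (1 - alpha) * B."""
    return [
        [a * f + (1.0 - a) * b for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(fg, bg, alpha)
    ]


rng = random.Random(0)
h, w = 8, 8
image_a = [[rng.random() for _ in range(w)] for _ in range(h)]
image_b = [[rng.random() for _ in range(w)] for _ in range(h)]
alpha = random_alpha(h, w, rng)
trimap = trimap_from_alpha(alpha)
blended = composite(image_a, image_b, alpha)
```

Under these assumptions, a model would receive `blended` together with `trimap` and be trained to recover the components that were mixed, which mirrors the input and output layout of a trimap-based matting network.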

Original language	English
Publisher	arXiv
DOIs	https://doi.org/10.48550/arXiv.2304.00784
Publication status	Published - 3 Apr 2023

Access to Document

10.48550/arXiv.2304.00784 (Licence: Other)

2304.00784v1 (other version, 7.84 MB; Licence: Other)

Cite this

@techreport{0dba149db9474935bafbe93741fe8ff8,
title = "Disentangled Pre-training for Image Matting",
author = "Yanda Li and Zilong Huang and Gang Yu and Ling Chen and Yunchao Wei and Jianbo Jiao",
year = "2023",
month = apr,
day = "3",
doi = "10.48550/arXiv.2304.00784",
language = "English",
publisher = "arXiv",
type = "WorkingPaper",
institution = "arXiv",
}

TY - UNPB
T1 - Disentangled Pre-training for Image Matting
AU - Li, Yanda
AU - Huang, Zilong
AU - Yu, Gang
AU - Chen, Ling
AU - Wei, Yunchao
AU - Jiao, Jianbo
PY - 2023/4/3
Y1 - 2023/4/3
U2 - 10.48550/arXiv.2304.00784
DO - 10.48550/arXiv.2304.00784
M3 - Preprint
BT - Disentangled Pre-training for Image Matting
PB - arXiv
ER -
