Title | Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification |
Author | |
Corresponding Author | Tang,Xiaoying |
DOI | |
Publication Years | 2022 |
Conference Name | 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) |
ISSN | 0302-9743 |
EISSN | 1611-3349 |
ISBN | 978-3-031-16451-4 |
Source Title | |
Pages | 88-98 |
Conference Date | SEP 18-22, 2022 |
Conference Place | Singapore, SINGAPORE |
Publication Place | GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND |
Publisher | SPRINGER INTERNATIONAL PUBLISHING AG |
Abstract | A large-scale labeled dataset is a key factor for the success of supervised deep learning in computer vision. However, limited annotated data are very common, especially in ophthalmic image analysis, since manual annotation is time-consuming and labor-intensive. Self-supervised learning (SSL) methods bring huge opportunities for better utilizing unlabeled data, as they do not need massive annotations. In an attempt to use as many unlabeled ophthalmic images as possible, it is necessary to break the dimension barrier and simultaneously make use of both 2D and 3D images. In this paper, we propose a universal self-supervised Transformer framework, named Uni4Eye, to discover the inherent image property and capture domain-specific feature embeddings in ophthalmic images. Uni4Eye can serve as a global feature extractor, which builds its basis on a Masked Image Modeling task with a Vision Transformer (ViT) architecture. We employ a Unified Patch Embedding module to replace the original patch embedding module in ViT for jointly processing both 2D and 3D input images. Besides, we design a dual-branch multitask decoder module to simultaneously perform two reconstruction tasks on the input image and its gradient map, delivering discriminative representations for better convergence. We evaluate the performance of our pre-trained Uni4Eye encoder by fine-tuning it on six downstream ophthalmic image classification tasks. The superiority of Uni4Eye is successfully established through comparisons with other state-of-the-art SSL pre-training methods. |
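Note on the Unified Patch Embedding idea described in the abstract: the module lets a single ViT encoder ingest both 2D images and 3D volumes by projecting either input into one token sequence. The snippet below is only an illustrative sketch of that idea, not the authors' released implementation: it assumes a PyTorch setting, and the class name UnifiedPatchEmbed, the channel/patch-size arguments, and the rank-based dispatch are hypothetical choices made for illustration.

import torch
import torch.nn as nn

class UnifiedPatchEmbed(nn.Module):
    """Illustrative sketch: project 2D (B, C, H, W) or 3D (B, C, D, H, W)
    inputs into one (B, N, embed_dim) token sequence for a ViT encoder."""

    def __init__(self, in_chans=3, patch_size=16, embed_dim=768):
        super().__init__()
        # One patchifier per input dimensionality, both mapping to the
        # same embedding width so the downstream ViT sees identical tokens.
        self.proj_2d = nn.Conv2d(in_chans, embed_dim,
                                 kernel_size=patch_size, stride=patch_size)
        self.proj_3d = nn.Conv3d(in_chans, embed_dim,
                                 kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        if x.dim() == 4:            # 2D batch: (B, C, H, W)
            x = self.proj_2d(x)
        elif x.dim() == 5:          # 3D batch: (B, C, D, H, W)
            x = self.proj_3d(x)
        else:
            raise ValueError(f"unexpected input rank {x.dim()}")
        # Flatten all spatial axes into one token axis: (B, N, embed_dim).
        return x.flatten(2).transpose(1, 2)

# Example: both calls yield sequences a shared ViT encoder can consume.
upe = UnifiedPatchEmbed()
tokens_2d = upe(torch.randn(1, 3, 224, 224))      # shape (1, 196, 768)
tokens_3d = upe(torch.randn(1, 3, 32, 224, 224))  # shape (1, 392, 768)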
Keywords | |
SUSTech Authorship | First ; Corresponding |
Language | English |
URL | [Source Record] |
Indexed By | |
Funding Project | Shenzhen Basic Research Program[JCYJ20200925153847004] ; National Natural Science Foundation of China[62071210] ; Shenzhen Science and Technology Program[RCYX20210609103056042] |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Theory & Methods |
WOS Accession No | WOS:000867418200009 |
Scopus EID | 2-s2.0-85139010610 |
Data Source | Scopus |
Citation statistics | Cited Times [WOS]: 0 |
Document Type | Conference paper |
Identifier | http://kc.sustech.edu.cn/handle/2SGJ60CL/406279 |
Department | Department of Electrical and Electronic Engineering |
Affiliation | 1. Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; 2. Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam, Hong Kong |
First Author Affiliation | Department of Electrical and Electronic Engineering |
Corresponding Author Affiliation | Department of Electrical and Electronic Engineering |
First Author's First Affiliation | Department of Electrical and Electronic Engineering |
Recommended Citation GB/T 7714 | Cai,Zhiyuan,Lin,Li,He,Huaqing,et al. Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification[C]. GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND: SPRINGER INTERNATIONAL PUBLISHING AG, 2022: 88-98. |
Files in This Item | There are no files associated with this item. |