Title

An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis

Author
Corresponding Author: Guo, Huanlei
Publication Years
2023
DOI
Source Title
Multimedia Systems
ISSN
0942-4962
EISSN
1432-1882
Abstract
Generating a realistic image that matches a given text description is a challenging task. The multi-stage framework, which obtains a high-resolution image by first constructing a low-resolution one, is widely adopted for text-to-image synthesis. However, the subsequent stages of existing generators have to construct the whole image repeatedly, even though the primitive features of the objects have already been sketched out in the preceding stage. To make the subsequent stages focus on enriching fine-grained details and to improve the quality of the final generated image, this paper proposes an efficient multi-path structure for the multi-stage framework. The proposed structure contains two parts: a staged connection and a multi-scale module. The staged connection transfers the feature maps of the generated image from the preceding stage to the end of the current stage. This path avoids the need for long-term memory and guides the network to focus on modifying and supplementing the details of the generated image. In addition, the multi-scale module extracts features at different scales and generates images with more fine-grained details. The proposed multi-path structure can be introduced into multi-stage algorithms such as StackGAN-v2 and AttnGAN. Extensive experiments are conducted on two widely used datasets, Oxford-102 and CUB, for the text-to-image synthesis task. The results demonstrate that the methods with the multi-path structure outperform the base models.
© 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
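The two mechanisms named in the abstract can be illustrated with a toy, framework-free sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`upsample_nearest`, `staged_connection`, `avg_pool`, `multi_scale_features`), nearest-neighbour upsampling, element-wise addition as the fusion step, and average pooling as the multi-scale operator. The published models would instead use learned convolutional layers inside a GAN generator.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one single-channel feature map as nested lists


def upsample_nearest(fmap: Grid, factor: int = 2) -> Grid:
    """Nearest-neighbour upsampling: repeat each value `factor` times per axis."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def staged_connection(prev_stage: Grid, current_stage: Grid) -> Grid:
    """Hypothetical staged connection: carry the previous stage's features to
    the end of the current stage by upsampling and adding them element-wise.
    Assumes the current size is an integer multiple of the previous size."""
    factor = len(current_stage) // len(prev_stage)
    up = upsample_nearest(prev_stage, factor)
    return [[p + c for p, c in zip(pr, cr)] for pr, cr in zip(up, current_stage)]


def avg_pool(fmap: Grid, k: int) -> Grid:
    """Non-overlapping k x k average pooling (assumes sizes divisible by k)."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def multi_scale_features(fmap: Grid, scales: Sequence[int] = (1, 2)) -> Grid:
    """Hypothetical multi-scale module: pool at each scale, upsample back to
    the input size, and average the branches."""
    h, w = len(fmap), len(fmap[0])
    acc = [[0.0] * w for _ in range(h)]
    for k in scales:
        branch = upsample_nearest(avg_pool(fmap, k), k)
        for i in range(h):
            for j in range(w):
                acc[i][j] += branch[i][j] / len(scales)
    return acc


if __name__ == "__main__":
    coarse = [[1.0, 2.0], [3.0, 4.0]]           # features from the previous stage
    fine = [[0.5] * 4 for _ in range(4)]         # detail features from the current stage
    print(staged_connection(coarse, fine))
    print(multi_scale_features([[1.0, 3.0], [5.0, 7.0]]))
```

In this sketch the current stage only has to contribute the residual detail (`fine`), because the coarse layout arrives through the skip path. That mirrors the motivation stated in the abstract, though not necessarily the authors' exact design.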
Indexed By
EI ; SCI
Language
English
SUSTech Authorship
Corresponding
Funding Project
The authors acknowledge the financial support from the Fundamental Research Funds for the Provincial Universities of Zhejiang (Grant No. GK219909299001-015), Natural Science Foundation of China (Grant No. 62206082), National Undergraduate Training Program for Innovation and Entrepreneurship (Grant No. 202110336042), Planted talent plan (Grant No. 2022R407A002) and Research on higher teaching reform (YBJG202233).
WOS Accession No
WOS:000939646100001
Publisher
EI Accession Number
20230913650700
EI Keywords
Software engineering
ESI Classification Code
Computer Programming:723.1
ESI Research Field
COMPUTER SCIENCE
Data Source
EV Compendex
Citation statistics
Cited Times [WOS]:0
Document Type: Journal Article
Identifier: http://kc.sustech.edu.cn/handle/2SGJ60CL/519650
Department: Department of Statistics and Data Science
Affiliation
1.Computer and Software School, Hangzhou Dianzi University, Hangzhou; 310018, China
2.Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen; 518055, China
3.Zhuoyue Honors College, Hangzhou Dianzi University, Hangzhou; 310018, China
4.Hangzhou oke Technology Co Ltd, Hangzhou; 310000, China
5.Hangzhou Dianzi University Shangyu Institute of Science and Engineering, Shangyu; 312300, China
Corresponding Author Affiliation: Department of Statistics and Data Science
Recommended Citation
GB/T 7714
Ding, Jiajun, Liu, Beili, Yu, Jun, et al. An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis[J]. MULTIMEDIA SYSTEMS, 2023.
APA
Ding, Jiajun, Liu, Beili, Yu, Jun, Guo, Huanlei, Shen, Ming, & Shen, Kenong. (2023). An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis. MULTIMEDIA SYSTEMS.
MLA
Ding, Jiajun, et al. "An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis". MULTIMEDIA SYSTEMS (2023).
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.