Title

Synthesizing Talking Face Videos with a Spatial Attention Mechanism

Author
Corresponding Author: Yu, Shiqi
DOI
Publication Years
2022
Conference Name
16th Chinese Conference on Biometric Recognition, CCBR 2022
ISSN
0302-9743
EISSN
1611-3349
ISBN
9783031202322
Source Title
Volume
13628 LNCS
Pages
519-528
Conference Date
November 11, 2022 - November 13, 2022
Conference Place
Beijing, China
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
Recently, talking face generation has drawn considerable attention from researchers due to its wide applications. The lip synchronization accuracy and visual quality of the generated target speaker are crucial for synthesizing photo-realistic talking face videos. Prior methods usually produced unnatural and incongruous results, or generated results with comparatively high fidelity but only for a specific target speaker. In this paper, we propose a novel adversarial learning framework for talking face generation of arbitrary target speakers. To provide sufficient visual information about the lip region during video synthesis, we introduce a spatial attention mechanism that enables our model to pay more attention to constructing the lip region. In addition, we add a content loss and a total variation regularization to our objective function in order to reduce lip shaking and artifacts in the deformed regions. Extensive experiments demonstrate that our method outperforms other representative approaches.
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
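The abstract names a total variation regularization but this record does not give its formula. As a rough illustration only, the sketch below computes one common form (anisotropic, L1) on a small grayscale image stored as a list of rows. The function name `total_variation` and the sample images are hypothetical and are not taken from the paper; the authors' actual loss may differ in form and weighting.

```python
def total_variation(image):
    """Anisotropic (L1) total variation of a 2D grayscale image.

    `image` is a list of equal-length rows of floats. This is a generic
    textbook form, assumed here for illustration; it is not the paper's
    confirmed formulation.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    tv = 0.0
    for y in range(height):
        for x in range(width):
            # Absolute difference to the right-hand neighbour.
            if x + 1 < width:
                tv += abs(image[y][x + 1] - image[y][x])
            # Absolute difference to the neighbour below.
            if y + 1 < height:
                tv += abs(image[y + 1][x] - image[y][x])
    return tv


# A flat image has zero total variation; a checkerboard-like one does not.
smooth = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
noisy = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
print(total_variation(smooth))
print(total_variation(noisy))
```

Adding a term like this to a generator's objective penalizes abrupt pixel-to-pixel changes, which matches the abstract's stated goal of reducing artifacts in deformed regions.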
SUSTech Authorship
First; Corresponding
Language
English
Indexed By
EI Accession Number
20225213295756
ESI Classification Code
Artificial Intelligence:723.4
Data Source
EV Compendex
Citation statistics
Cited Times [WOS]:0
Document Type
Conference paper
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/519713
Department
Department of Computer Science and Engineering
Affiliation
1.Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen; 518055, China
2.Ping An Technology, Shenzhen, China
First Author Affiliation
Department of Computer Science and Engineering
Corresponding Author Affiliation
Department of Computer Science and Engineering
First Author's First Affiliation
Department of Computer Science and Engineering
Recommended Citation
GB/T 7714
Wang, Ting,Zhou, Chaoyong,Yu, Shiqi. Synthesizing Talking Face Videos with a Spatial Attention Mechanism[C]:Springer Science and Business Media Deutschland GmbH,2022:519-528.

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.