Title

A Dual-branch Convolutional Network Architecture Processing on both Frequency and Time Domain for Single-channel Speech Enhancement

Author
Zhang, Kanghao; He, Shulin; Li, Hao; Zhang, Xueliang
Corresponding Author
Zhang, Xueliang
Publication Years
2023
Source Title
APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING
ISSN
2048-7703
Volume
12
Issue
3
Abstract
Single-channel speech enhancement aims to remove interfering noise and reverberation in real environments using a single microphone, which is a very challenging task in speech signal processing. In recent years, deep learning has shown great potential for speech enhancement. In this paper, we propose a novel real-time framework called DBCN, which has a dual-branch architecture. One branch takes the waveform as input for time-domain modeling, and the other takes the shift real spectrum as input for frequency-domain modeling. The two branches have the same network structure, a representative convolutional recurrent network. To exchange information sufficiently, a bridge module is added between the two branches. Furthermore, we propose a novel feature normalization approach in which each band is normalized independently: the root mean square of each band is computed and the inter-frame relationship for each band is obtained. This approach allows the network to ignore the magnitude during processing, which reduces learning difficulty and improves performance. Systematic evaluation and comparison are conducted. Experimental results show that the proposed system substantially outperforms related algorithms for causal and non-causal speech enhancement in very challenging environments.
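The abstract describes the band-wise normalization only in words and gives no formula, so the following is a minimal sketch of the general idea rather than the authors' method. It assumes a bands-by-frames matrix and a non-causal root mean square taken over all frames of each band. The function name `band_rms_normalize` and the `eps` parameter are hypothetical, and the inter-frame relationship mentioned in the abstract is not modeled here.

```python
import math
from typing import List


def band_rms_normalize(spec: List[List[float]], eps: float = 1e-8) -> List[List[float]]:
    """Scale each band (row) of a bands-by-frames matrix by its own RMS.

    Assumption: every band holds at least one frame, and the RMS is taken
    over all frames (non-causal). The paper may differ on both points.
    """
    normalized = []
    for band in spec:
        # Root mean square of this band across its frames.
        rms = math.sqrt(sum(x * x for x in band) / len(band))
        # eps guards against division by zero for an all-zero band.
        normalized.append([x / (rms + eps) for x in band])
    return normalized


# Two bands with the same frame-to-frame shape but very different magnitudes.
spec = [[1.0, -2.0, 3.0], [100.0, -200.0, 300.0]]
out = band_rms_normalize(spec)
```

After normalization the two rows of `out` are nearly identical and each has an RMS close to 1, which illustrates how per-band scaling removes the absolute magnitude while keeping the shape across frames.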
Language
English
SUSTech Authorship
Others
Funding Project
China National Nature Science Foundation: 61876214, KF-2022-07-009
WOS Research Area
Engineering
WOS Subject
Engineering, Electrical & Electronic
WOS Accession No
WOS:001030577500006
Data Source
Web of Science
Document Type
Journal Article
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/583054
Department
Department of Electrical and Electronic Engineering
Affiliation
1.Inner Mongolia Univ, Coll Comp Sci, Hohhot, Peoples R China
2.Southern Univ Sci & Technol, Dept Elect & Elect Engn, Shenzhen, Peoples R China
Recommended Citation
GB/T 7714
Zhang, Kanghao, He, Shulin, Li, Hao, et al. A Dual-branch Convolutional Network Architecture Processing on both Frequency and Time Domain for Single-channel Speech Enhancement[J]. APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2023, 12(3).
APA
Zhang, Kanghao, He, Shulin, Li, Hao, & Zhang, Xueliang. (2023). A Dual-branch Convolutional Network Architecture Processing on both Frequency and Time Domain for Single-channel Speech Enhancement. APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 12(3).
MLA
Zhang, Kanghao, et al. "A Dual-branch Convolutional Network Architecture Processing on both Frequency and Time Domain for Single-channel Speech Enhancement." APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING 12.3 (2023).

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.