Title

A Segmented Auditory Attention Decoding Model and Its Application to Neurofeedback Based Target Speech Perception

Alternative Title
分段注意力解码方法及其在基于神经反馈的目标语音感知中的应用
Author
School number
11650012
Degree
Doctoral
Discipline
Electronic and Electrical Engineering
Subject category of dissertation
Electronic and Electrical Engineering
Supervisor
陈霏
Tutor of External Organizations
Ed X. Wu
Publication Years
2022-03-09
Submission date
2022-03-11
University
The University of Hong Kong
Place of Publication
Hong Kong
Abstract

Human listeners can effortlessly perceive a target speech stream in complex auditory scenes. Neuroimaging technologies such as electroencephalography (EEG) have been widely used to study the neural mechanisms of target speech perception and to decode the patterns of auditory attention modulation in such scenes. Previous behavioral and neurological studies demonstrated that target speech perception depends on regular hierarchical structures; nevertheless, little is known about how auditory attention modulation interacts with different speech segments during target speech perception. This doctoral study aimed mainly to reveal the mechanism underlying auditory attention modulation for different root-mean-square (RMS)-level-based speech segments, and to develop advanced auditory attention decoding (AAD) methods that further support target speech perception in complex auditory scenes. First, the contribution of different RMS-level-based segments to speech perception was examined through behavioral and neurological tests. Behavioral results showed that different RMS-level-based segments carry distinct information and play different roles in speech intelligibility. In addition, neurological experiments demonstrated that each type of RMS-level-based speech segment elicited a specific cortical response pattern under auditory attention modulation, indicating that target speech perception was jointly affected by the type of RMS-level-based segment and by auditory attention modulation. These findings provide new perspectives on the mechanisms of speech perception under auditory attention modulation in complex auditory scenes. Next, a speech-RMS-level-based segmented AAD model was proposed to improve AAD performance over a wide range of signal-to-masker ratios (SMRs). The proposed segmented AAD model consisted of three steps.
First, a support vector machine classifier used the corresponding EEG signals to predict whether the perceived auditory stimulus belonged to a higher- or a lower-RMS-level-based speech segment. Second, the speech envelope was reconstructed using the AAD model specific to that type of speech segment. Third, the target speech was determined by comparing the correlation coefficients between the original and reconstructed speech envelopes. Compared to the traditional unified AAD model, which did not separate the functional roles of higher- and lower-RMS-level-based speech segments in AAD, the proposed segmented method significantly improved AAD accuracy, even at low SMRs and with short decoding windows. Finally, the proposed segmented AAD model was combined with advanced speech processing algorithms to develop an intention-adaptive speech signal processing system for competing-speaker environments. To apply such a neurofeedback-based speech signal processing system in real-life scenes, subjects were required to focus on one speaker or to switch their attention between the competing speakers according to the experimental requirements. Results showed that cortical tracking of the target speech stream could serve as a reliable biomarker of the dynamics of auditory attention states. The neurofeedback-based intention-adaptive system could facilitate target speech perception at different SMRs when auditory attention was dynamically switched from one speaker stream to the other. These findings indicate that the neurofeedback-based speech separation system has the potential to improve target speech perception in complex auditory scenes.
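The three-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the EEG is simulated, the decoder is a generic ridge-regularized linear backward model, and step 1 (the support vector machine that predicts the segment type from EEG) is replaced by a segment label passed in by hand. All function names, the lag count, the ridge parameter, and the simulation settings are assumptions made for illustration only.

```python
import numpy as np

def lagged(eeg, n_lags):
    """Stack time-lagged EEG copies: row t holds eeg[t], ..., eeg[t+n_lags-1]."""
    n, c = eeg.shape
    X = np.zeros((n, c * n_lags))
    for k in range(n_lags):
        X[: n - k, k * c:(k + 1) * c] = eeg[k:]
    return X

def train_decoder(eeg, envelope, n_lags=8, ridge=1.0):
    """Fit a ridge-regularized linear map from lagged EEG to the speech envelope."""
    X = lagged(eeg, n_lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ envelope)

def segmented_decode(eeg, env_a, env_b, segment_label, decoders, n_lags=8):
    """Steps 2 and 3: reconstruct with the segment-specific decoder, then
    pick the speaker whose envelope correlates best with the reconstruction."""
    recon = lagged(eeg, n_lags) @ decoders[segment_label]
    r_a = np.corrcoef(recon, env_a)[0, 1]
    r_b = np.corrcoef(recon, env_b)[0, 1]
    return ("A" if r_a > r_b else "B"), r_a, r_b

def simulate_eeg(attended_env, rng, n_channels=16, delay=3, noise=0.5):
    """Toy EEG (assumption): each channel is a delayed, scaled copy of the
    attended envelope plus Gaussian noise."""
    delayed = np.concatenate([np.zeros(delay), attended_env[:-delay]])
    gains = np.linspace(0.5, 1.5, n_channels)
    return np.outer(delayed, gains) + noise * rng.standard_normal(
        (attended_env.size, n_channels))

rng = np.random.default_rng(0)
n = 4000

# One decoder per segment type, each trained on its own simulated data.
decoders = {}
for label in ("higher", "lower"):
    env = np.abs(rng.standard_normal(n))
    decoders[label] = train_decoder(simulate_eeg(env, rng), env)

# Test trial: the listener attends speaker A; the label stands in for step 1.
env_a = np.abs(rng.standard_normal(n))
env_b = np.abs(rng.standard_normal(n))
eeg = simulate_eeg(env_a, rng)
attended, r_a, r_b = segmented_decode(eeg, env_a, env_b, "higher", decoders)
print(attended, round(r_a, 2), round(r_b, 2))
```

In a unified model, a single decoder would be used for every window; the segmented variant differs only in the lookup `decoders[segment_label]`, which is where a classifier's prediction would enter.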

Keywords
Language
English
Training classes
Joint training
Document Type
Thesis
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/406323
Department
Department of Electrical and Electronic Engineering
Recommended Citation
GB/T 7714
Wang L. A Segmented Auditory Attention Decoding Model and Its Application to Neurofeedback Based Target Speech Perception[D]. Hong Kong: The University of Hong Kong, 2022.
Files in This Item:
File Name/Size DocType Version Access License
A Segmented Auditory(10684KB) Restricted Access--Fulltext Requests

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.