Title | A Segmented Auditory Attention Decoding Model and Its Application to Neurofeedback Based Target Speech Perception |
Alternative Title | A Segmented Attention Decoding Method and Its Application to Neurofeedback-Based Target Speech Perception
|
Author | |
School number | 11650012
|
Degree | Doctoral
|
Discipline | Electrical and Electronic Engineering
|
Subject category of dissertation | Electrical and Electronic Engineering
|
Supervisor | |
Tutor of External Organizations | Ed X. Wu
|
Publication Years | 2022-03-09
|
Submission date | 2022-03-11
|
University | The University of Hong Kong
|
Place of Publication | Hong Kong
|
Abstract | Human listeners can effortlessly perceive a target speech stream in complex auditory scenarios. Neuroimaging technologies such as electroencephalography (EEG) have been widely used to understand the neural mechanisms of target speech perception and to decode auditory attention modulation patterns in complex auditory scenes. Previous behavioral and neurological studies demonstrated that target speech perception depends on regular hierarchical structures. Nevertheless, little is known about how auditory attention modulation interacts with different speech segments during target speech perception. This doctoral study aimed mainly to reveal the mechanism underlying auditory attention modulation for different root-mean-square (RMS)-level-based speech segments, and to develop advanced auditory attention decoding (AAD) methods that further aid target speech perception in complex auditory scenarios. First, the contribution of different RMS-level-based segments to speech perception was examined through behavioral and neurological tests. Behavioral results showed that different RMS-level-based segments carried distinct information and played different roles in speech intelligibility. In addition, neurological experiments demonstrated that each type of RMS-level-based speech segment elicited a specific cortical response pattern under auditory attention modulation, indicating that target speech perception was jointly affected by the type of RMS-level-based segment and by auditory attention modulation. These findings provide new perspectives on the speech perception mechanisms of auditory attention modulation in complex auditory scenes. Next, a speech-RMS-level-based segmented AAD model was proposed to improve AAD performance over a wide range of signal-to-masker ratios (SMRs). The proposed segmented AAD model consisted of three steps.
First, a support vector machine classifier was used to predict, from the corresponding EEG signals, whether the perceived auditory stimuli belonged to higher- or lower-RMS-level-based speech segments. Subsequently, the speech envelope was reconstructed using the AAD model specific to each type of speech segment. Lastly, the target speech was determined by comparing the correlation coefficients between the original and reconstructed speech envelopes. Compared with the traditional unified AAD model, which does not separate the functional roles of higher- and lower-RMS-level-based speech segments in AAD, the proposed segmented method significantly improved AAD accuracy, even at low SMRs and with short decoding windows. Finally, the proposed segmented AAD model was combined with advanced speech processing algorithms to develop an intention-adaptive speech signal processing system for competing-speaker environments. To apply such a neurofeedback-based speech signal processing system in real-life scenes, subjects were required to focus on or switch their attention between the competing speakers according to the experimental requirements. Results showed that cortical tracking of the target speech stream could serve as a reliable biomarker of the dynamics of auditory attention states. The neurofeedback-based intention-adaptive system could facilitate target speech perception under different SMRs when auditory attention was dynamically switched from one speaker stream to the other. These findings indicate that the neurofeedback-based speech separation system has the potential to improve target speech perception in complex auditory scenes. |
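The three decoding steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: the function names `pearson` and `decode_attention` are hypothetical, the segment classifier is passed in as a plain callable standing in for the support vector machine, and each segment-specific decoder is assumed to be a simple instantaneous linear map from EEG channels to the envelope (the abstract does not specify the decoder form or any time lags).

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def decode_attention(eeg, env_a, env_b, segment_classifier, decoders):
    """Decide which of two speakers is attended in one decoding window.

    eeg                : array (n_samples, n_channels), EEG for the window
    env_a, env_b       : arrays (n_samples,), envelopes of the two speakers
    segment_classifier : callable returning "higher" or "lower" for the
                         window (hypothetical stand-in for the SVM step)
    decoders           : dict mapping "higher"/"lower" to a weight vector
                         of shape (n_channels,) (assumed linear decoder)
    """
    # Step 1: predict the RMS-level segment type from the EEG.
    segment = segment_classifier(eeg)

    # Step 2: reconstruct the envelope with the segment-specific decoder.
    reconstructed = eeg @ decoders[segment]

    # Step 3: the speaker whose envelope correlates more strongly with
    # the reconstruction is taken as the attended (target) speaker.
    r_a = pearson(reconstructed, env_a)
    r_b = pearson(reconstructed, env_b)
    label = "A" if r_a >= r_b else "B"
    return label, r_a, r_b
```

In this sketch the only difference from a unified model is the dictionary lookup in step 2: a unified decoder would apply one weight vector to every window regardless of segment type.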
Keywords | |
Language | English
|
Training classes | Joint training
|
Document Type | Thesis |
Identifier | http://kc.sustech.edu.cn/handle/2SGJ60CL/406323 |
Department | Department of Electrical and Electronic Engineering |
Recommended Citation GB/T 7714 |
Wang L. A Segmented Auditory Attention Decoding Model and Its Application to Neurofeedback Based Target Speech Perception[D]. Hong Kong: The University of Hong Kong, 2022.
|
Files in This Item: | ||||||
File Name/Size | DocType | Version | Access | License | ||
A Segmented Auditory(10684KB) | Restricted Access | -- | Fulltext Requests |
|
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.