A Short-time Window ElectroEncephaloGram Auditory Attention Decoding Network Based on Multi-dimensional Characteristics of Temporal-spatial-frequency

Original title: 一种基于时空频多维特征的短时窗口脑电听觉注意解码网络


Authors: WANG Chunli (王春丽)[1]; LI Jinxu (李金絮); GAO Yuxin (高玉鑫); WANG Chenming (王晨名); ZHANG Jiahao (张珈豪)

Affiliation: [1] School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730000, China

Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2025, No. 3, pp. 814-824 (11 pages)

Funding: Lanzhou Jiaotong University-Tianjin University Joint Innovation Fund (LH2023002); Natural Science Foundation of Tianjin (21JCZXJC00190).

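Abstract: In cocktail party scenarios, people with normal hearing can selectively attend to a specific speaker's voice, whereas people with hearing impairments have difficulty in such settings. Auditory Attention Decoding (AAD) aims to infer which speaker a listener is attending to by analyzing the response characteristics of the listener's ElectroEncephaloGram (EEG) signals. Existing AAD models consider only a single time-domain or frequency-domain feature of the EEG signal, or a combination of the two (such as time-frequency features), and ignore the complementarity among temporal, spatial, and frequency-domain features. This limits the models' classification ability to some extent and in turn reduces decoding accuracy within a decision window. In addition, most existing AAD models achieve high decoding accuracy only in long decision windows (1~5 s). This paper proposes a short-window EEG auditory attention decoding network based on multi-dimensional temporal-spatial-frequency features (TSF-AADNet) to improve decoding accuracy in short decision windows (0.1~1 s). The model consists of two parallel branches, one for spatiotemporal and one for frequency-space feature extraction, together with a feature fusion and classification module. The spatiotemporal branch consists of a spatiotemporal convolution block and a high-order feature interaction module; the frequency-space branch uses a 3D convolution module based on frequency-space attention (FSA-3DCNN). The spatiotemporal and frequency-space features extracted by the two branches are then fused to produce the final binary auditory attention decoding result. Experimental results show that, with a 0.1 s decision window, TSF-AADNet achieves decoding accuracies of 91.8% and 81.1% on the KULeuven (KUL) auditory attention detection dataset and the DTU EEG and audio dataset for auditory attention detection, respectively, improvements of 5.40% and 7.99% over DBPNet, a recent AAD model built as a dual-branch parallel network based on time-frequency fusion. As a new AAD model for short decision windows, TSF-AADNet can provide a useful reference for the diagnosis of hearing impairment and the development of neuro-steered hearing aids.

Objective: In cocktail party scenarios, individuals with normal hearing can selectively focus on specific speakers, whereas individuals with hearing impairments often struggle in such environments. Auditory Attention Decoding (AAD) aims to infer the speaker that a listener is attending to by analyzing the brain's electrical response, recorded through ElectroEncephaloGram (EEG). Existing AAD models typically focus on a single feature of EEG signals in the time domain, frequency domain, or time-frequency domain, often overlooking the complementary characteristics across the time-space-frequency domain. This limitation constrains the model's classification ability, ultimately affecting decoding accuracy within a decision window. Moreover, while many current AAD models exhibit high accuracy over long decision windows (1~5 s), real-time AAD in practical applications requires a more robust approach to short EEG segments.

Methods: This paper proposes a short-window EEG auditory attention decoding network, Temporal-Spatial-Frequency Features-AADNet (TSF-AADNet), designed to enhance decoding accuracy in short decision windows (0.1~1 s). TSF-AADNet decodes the focus of auditory attention from EEG signals, eliminating the need for speech separation. The model consists of two parallel branches, one for spatiotemporal feature extraction and another for frequency-space feature extraction, followed by feature fusion and classification. The spatiotemporal feature extraction branch includes a spatiotemporal convolution block, a high-order feature interaction module, a two-dimensional convolution layer, an adaptive average pooling layer, and a Fully Connected (FC) layer. The spatiotemporal convolution block extracts EEG features across both the time and space dimensions, capturing the correlation between signals at different time points and electrode positions. The high-order feature interaction module further enhances feature interactions at different levels, improving the model's feature representation ability. The frequency-space featur…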
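The distinction between long (1~5 s) and short (0.1~1 s) decision windows comes down to how many EEG samples the classifier sees per decision. The following is a minimal sketch of that segmentation step, not the authors' preprocessing: the 128 Hz sampling rate, the two-channel toy recording, and the helper name `split_into_windows` are all assumptions made for illustration.

```python
from typing import List

Recording = List[List[float]]  # rows are time samples, columns are channels


def split_into_windows(eeg: Recording, fs: int, window_s: float) -> List[Recording]:
    """Cut a recording into consecutive, non-overlapping decision windows.

    fs is the sampling rate in Hz and window_s the window length in seconds.
    Trailing samples that do not fill a whole window are dropped.
    """
    win = int(round(fs * window_s))  # samples per decision window
    if win < 1:
        raise ValueError("window shorter than one sample")
    last_start = len(eeg) - win
    return [eeg[start:start + win] for start in range(0, last_start + 1, win)]


# Hypothetical toy data: 1 s of two-channel "EEG" at an assumed 128 Hz.
FS = 128
toy_eeg = [[float(t), float(-t)] for t in range(FS)]

short_windows = split_into_windows(toy_eeg, FS, 0.1)  # short decision window
long_windows = split_into_windows(toy_eeg, FS, 1.0)   # long decision window

print(len(short_windows), len(short_windows[0]))
print(len(long_windows), len(long_windows[0]))
```

At the assumed sampling rate, a 0.1 s window holds only about a tenth of the samples of a 1 s window, which is why short-window decoding leaves the model far less signal per decision.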

Keywords: EEG signals; auditory attention decoding; short decision window; temporal-spatial-frequency features; neuro-steered hearing aids

Classification number: TN911.7 [Electronics and Telecommunications: Communication and Information Systems]

 
