Advancements and Prospects in Dysarthria Speaker Adaptation

Authors: KANG Xinchen, DONG Xueyan, YAO Dengfeng, ZHONG Jinghua [1]

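Affiliations: [1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China; [2] Lab of Computational Linguistics, School of Humanities, Tsinghua University, Beijing 100084, China; [3] Center for Psychology and Cognitive Science, Tsinghua University, Beijing 100084, China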

Source: Computer Science (《计算机科学》), 2024, No. 8, pp. 11-19 (9 pages)

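Funding: Beijing Natural Science Foundation (4202028); State Language Commission project (YB145-25); National Natural Science Foundation of China (62036001); National Social Science Fund of China (21BYY106, 21&ZD292); 2019 Science and Technology General Project of the Beijing Municipal Education Commission (KM201911417005).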

Abstract: Automatic speech recognition (ASR) tools ease communication between people with dysarthria and typical speakers, and dysarthric speech recognition has consequently become an active research topic in recent years. Research in this area involves collecting speech data from speakers with dysarthria and from typical speakers, representing the acoustic features of both kinds of speech, and using machine learning models to compare and recognize the spoken content and to locate the differences, so as to help people with dysarthria improve their pronunciation. However, because large amounts of dysarthric speech data are very difficult to collect and dysarthric speech is highly variable, general-purpose speech recognition models often perform poorly. To address this problem, many studies have introduced speaker adaptation methods into dysarthric speech recognition. A survey of the relevant literature shows that current work analyzes dysarthric speech mainly in the feature domain and the model domain. This paper focuses on how feature transformation and auxiliary features address differences in the representation of speech features, and on how linear transformation of acoustic models, fine-tuning of acoustic model parameters, and data-selection-based domain adaptation improve recognition accuracy. Finally, it summarizes the problems currently facing research on dysarthric speaker adaptation and suggests that future work can improve the effectiveness of dysarthric speech recognition models through analysis of speech variability, fusion of multi-feature and multi-modal data, and adaptation methods that work with small amounts of data.

Keywords: dysarthria; speaker adaptation; auxiliary features; transformation; fine-tuning; domain adaptation

Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
