Authors: SONG Wei (宋伟)[1,2,3]; ZHANG Yanghao (张杨豪)
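Affiliations: [1] School of Information Engineering, Minzu University of China, Beijing 100081, China; [2] National Language Resource Monitoring and Research Center for Minority Languages, Beijing 100081, China; [3] Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Beijing 100081, China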
Source: Computer Engineering and Applications (《计算机工程与应用》), 2024, No. 11, pp. 62-74 (13 pages)
Abstract: Dysarthria is a difficult medical condition, and current mainstream speech recognition technology does not adapt well to the needs of this field. Speech recognition for dysarthria combines pre-training with personalized training and, through data-driven methods, has further improved algorithm performance and reduced the recognition word error rate. However, the technology is still some distance from practical commercial use, and its development is limited by data scale and by the techniques available. So far, no review article on dysarthric speech recognition has appeared. A comparative analysis of the dataset construction methods and advanced techniques in this field is urgently needed so that researchers entering the field can quickly acquire the relevant knowledge. This paper surveys existing datasets, mainstream algorithms, and evaluation methods, and summarizes the scale, form, and characteristics of the mainstream dysarthria datasets in China and abroad. It analyzes the mainstream algorithms for dysarthric speech recognition and reports the performance and characteristics of the different algorithms. Finally, it examines performance evaluation metrics for algorithm models based on the severity level of patients with dysarthria and discusses future research directions, with the aim of helping researchers working on dysarthric speech recognition and supporting the rapid development of the field.
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]