Research on Annotation Methods for Speech Corpora Based on Computing Technology (基于计算技术的语音语料库标注方法研究)


Authors: YANG Zheng (杨政); MA Yanzhou (马延周) (Strategic Support Force Information Engineering University, Luoyang, Henan 471000)

Affiliation: [1] Strategic Support Force Information Engineering University, Luoyang, Henan 471000

Source: Software (《软件》), 2023, No. 3, pp. 167-169 (3 pages)

Abstract: When developing speech recognition systems, researchers need correct phonetic and lexical annotations to build sound speech and language models. Adding phonetic and lexical annotations to a speech corpus consumes substantial manpower and material resources, and because existing systems cannot annotate automatically, the work has to be done by hand. After reviewing the state of research on data annotation technology, this paper analyzes computing-technology-based annotation methods for speech corpora in light of the annotation forms used in such corpora and the factors that affect them, and then validates the methods against results from practical application. The results show that annotating with computing technology can generate lexical and phonetic annotations effectively and at low cost.

Keywords: computing technology; speech corpus; annotation method; phonetic annotation; lexical annotation

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
