End-to-end Vietnamese text normalization method based on editing constraints (original title: 基于编辑约束的端到端越南语文本正则化方法)

Authors: JIANG Ming; WANG Linqin; LAI Hua [1,2]; GAO Shengxiang [1,2] (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming Yunnan 650504, China; Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming Yunnan 650500, China)

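Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504; [2] Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming 650500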

Source: Journal of Computer Applications (《计算机应用》), 2025, No. 2, pp. 362-370 (9 pages)

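Funding: National Natural Science Foundation of China (62376111, U21B2027, 62366027, 61972186); Yunnan High-Tech Industry Development Project (201606); Yunnan Key Research and Development Program (202302AD080003, 202303AP140008); Yunnan Fundamental Research Program (202001AS070014); Yunnan Reserve Talent Program for Academic and Technical Leaders (202105AC160018).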

Abstract: Text normalization is an indispensable step in the front-end analysis of Text-To-Speech (TTS) systems, and semantic ambiguity is its main challenge, particularly the ambiguity of non-standard words such as numbers, dates, and times. To address this problem, an end-to-end text normalization method based on editing constraints is proposed. After fully considering the linguistic characteristics of Vietnamese, a labelling method is designed specifically for Vietnamese to improve the model's ability to capture contextual semantic information. Furthermore, because neural network models are prone to unrecoverable errors, an editing alignment algorithm is proposed to constrain the scope of non-standard word text effectively. This reduces the search space at the decoding end and avoids prediction errors on non-normalized text that are caused by limitations of the model itself. FastCorrect is selected as the baseline model, and the optimization methods are applied to it to obtain new models. Experimental results show that, in the Vietnamese comparison of different optimization methods, the proposed model improves precision by 23.71 percentage points over the baseline model using unlabeled data, and by 26.24 percentage points in the corresponding Chinese experiments. The method therefore performs well not only on Vietnamese but also on Chinese open-source data, which confirms its applicability beyond Vietnamese. Moreover, compared with six baseline models, the model using the proposed method achieves the highest precision, 97.14%, and exceeds the two-stage Weighted Finite-State Transducer (WFST) method by 2.29 percentage points in F1-score, demonstrating the superiority of the proposed method on text normalization tasks.
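The abstract does not spell out the editing alignment algorithm, so the following is only an illustrative sketch of the general idea of constraining edits to non-standard word spans, not the authors' implementation. It assumes whitespace tokenization and uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner; the function names `align_edits` and `apply_constrained` and the Vietnamese example sentence are hypothetical.

```python
from difflib import SequenceMatcher


def align_edits(source_tokens, target_tokens):
    """Align source and target; return (start, end, replacement) per changed span.

    Assumption: any span that differs between the written form and the
    spoken form is treated as a non-standard word span.
    """
    matcher = SequenceMatcher(a=source_tokens, b=target_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((i1, i2, target_tokens[j1:j2]))
    return edits


def apply_constrained(source_tokens, edits):
    """Rebuild the output, copying every token outside the edit spans verbatim."""
    out, cursor = [], 0
    for start, end, replacement in edits:
        out.extend(source_tokens[cursor:start])  # context is never rewritten
        out.extend(replacement)                  # only the span is rewritten
        cursor = end
    out.extend(source_tokens[cursor:])
    return out


# Hypothetical example: the date "2/9" is the only non-standard word.
source = "ngày 2/9 là ngày lễ".split()
target = "ngày hai tháng chín là ngày lễ".split()
edits = align_edits(source, target)
restored = apply_constrained(source, edits)
```

In this sketch the decoder would only need to generate the replacement for the span covering `2/9`, while the surrounding tokens are copied unchanged, which is one way to picture how restricting the editable range shrinks the search space.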

Keywords: Vietnamese; text normalization; editing alignment algorithm; speech synthesis; end-to-end

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
