Authors: JIANG Ming (蒋铭); WANG Linqin (王琳钦); LAI Hua (赖华)[1,2]; GAO Shengxiang (高盛祥)[1,2]
Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650504, China; [2] Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming, Yunnan 650500, China
Source: Journal of Computer Applications (《计算机应用》), 2025, No. 2, pp. 362-370 (9 pages)
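Funding: National Natural Science Foundation of China (62376111, U21B2027, 62366027, 61972186); Yunnan High-Tech Industry Development Project (201606); Yunnan Key Research and Development Program (202302AD080003, 202303AP140008); Yunnan Fundamental Research Program (202001AS070014); Yunnan Reserve Talent Program for Academic and Technical Leaders (202105AC160018).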
Abstract: Text normalization is an indispensable step in the front-end analysis of Text-To-Speech (TTS), and semantic ambiguity is the main challenge it faces, particularly the semantic ambiguity of non-standard words such as numbers, dates, and times. To address this problem, an end-to-end text normalization method based on editing constraints is proposed. After fully considering the linguistic characteristics of Vietnamese, a labelling method is designed specifically for Vietnamese to improve the model's ability to capture contextual semantic information. Furthermore, because neural network models are prone to unrecoverable errors, an editing alignment algorithm is proposed to constrain the scope of non-standard word text effectively, which reduces the search space at the decoding end and avoids prediction errors on text that needs no normalization, errors that stem from limitations of the model itself. The FastCorrect model is selected as the baseline, and the various optimization methods are applied to it to obtain new models. Experimental results show that, in the Vietnamese comparison of different optimization methods, the proposed model improves precision by 23.71 percentage points over the baseline model trained on unlabelled data, and by 26.24 percentage points in the corresponding Chinese experiments. The method therefore performs well not only on Vietnamese but also on Chinese open-source data, which confirms its applicability beyond Vietnamese. Moreover, compared with six baseline models, the model using the proposed method achieves the highest precision, 97.14%, and exceeds the two-stage Weighted Finite-State Transducer (WFST) method by 2.29 percentage points in F1-score, which verifies the superiority of the proposed method on text normalization tasks.
Keywords: Vietnamese; text normalization; editing alignment algorithm; speech synthesis; end-to-end
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
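Illustrative note: the abstract describes an editing alignment step that restricts rewriting to non-standard-word spans, but this record does not give the algorithm itself. The sketch below is therefore not the authors' method. It is a minimal stand-in that uses Python's standard `difflib.SequenceMatcher` to align a raw token sequence with its normalized form, and then applies edits only inside the aligned spans so that all other tokens are copied unchanged. The function names `edit_align` and `apply_constrained` and the English example sentence are hypothetical.

```python
from difflib import SequenceMatcher


def edit_align(source_tokens, target_tokens):
    """Align raw and normalized tokens; return the spans that differ.

    Each span is (start, end, replacement): source_tokens[start:end]
    is a candidate non-standard-word region and replacement is its
    normalized form. This is an assumed stand-in, not the paper's algorithm.
    """
    matcher = SequenceMatcher(a=source_tokens, b=target_tokens, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            spans.append((i1, i2, target_tokens[j1:j2]))
    return spans


def apply_constrained(source_tokens, spans):
    """Rewrite only inside the given spans; copy every other token as-is."""
    output = []
    cursor = 0
    for start, end, replacement in spans:
        output.extend(source_tokens[cursor:start])  # untouched context
        output.extend(replacement)                  # edited span
        cursor = end
    output.extend(source_tokens[cursor:])
    return output


# Hypothetical example: only the date and the time are non-standard words.
raw = "The meeting is on 3/5 at 10:30".split()
normalized = "The meeting is on March fifth at ten thirty".split()

spans = edit_align(raw, normalized)
rebuilt = apply_constrained(raw, spans)
```

Because tokens outside the spans are copied rather than generated, a decoder constrained this way cannot alter ordinary words, which is the kind of search-space reduction the abstract refers to.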