Methods of ambiguous words mining and machine translation optimization (original title: 歧义词挖掘与机器翻译优化方法)  Cited by: 1

Authors: SUN Li-li (孙李丽); GUO Lin (郭琳) [2]; ZHANG Wen-nuo (张文诺)

Affiliations: [1] School of Humanities, Shangluo University, Shangluo 726000, Shaanxi Province, China; [2] Electronic Information and Electrical Engineering College, Shangluo University, Shangluo 726000, Shaanxi Province, China

Source: Information Technology (《信息技术》), 2022, No. 8, pp. 27-32, 37 (7 pages in total)

Funding: Shangluo University Education and Teaching Reform Project (19jyjx122); Major Program of the National Social Science Fund of China (15ZDB099).

Abstract: Mining and recognizing ambiguous words and optimizing the translation algorithm are key to improving the accuracy and efficiency of machine translation. This paper builds a high-frequency dictionary and a low-frequency dictionary of ambiguous words from literary works, extracts fixed word forms from them, and iteratively selects the best sense of each word. Feature alignment confidence is then used to align and recognize Chinese-English bilingual texts, which achieves disambiguation. Finally, evaluation metrics are used to compare the translation performance of a traditional algorithm with that of the proposed algorithm. The results show that the proposed algorithm disambiguates better than the traditional algorithm, and that it performs better on full-length novels than on novellas. The proposed algorithm is particularly strong at selecting and recognizing fixed word forms such as personal names, forms of address, objects, and slang. It adapts quickly to the distinctive language of a work, lowers the error rate in translating ambiguous words, and improves machine translation quality.
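The first step described in the abstract, separating ambiguous words into a high-frequency and a low-frequency dictionary, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `split_by_frequency`, the `threshold` cutoff, and the sample tokens are all assumptions made for illustration, since the abstract does not state how the split is made.

```python
from collections import Counter


def split_by_frequency(tokens, ambiguous_words, threshold=3):
    """Split ambiguous words found in `tokens` into two frequency dictionaries.

    `threshold` is a hypothetical cutoff; the abstract does not specify one.
    Returns (high, low), each mapping a word to its count in `tokens`.
    """
    # Count only the tokens that are in the known set of ambiguous words.
    counts = Counter(t for t in tokens if t in ambiguous_words)
    high = {w: c for w, c in counts.items() if c >= threshold}
    low = {w: c for w, c in counts.items() if c < threshold}
    return high, low


# Toy corpus (hypothetical data, not from the paper).
tokens = ["bank", "river", "bank", "spring", "bank", "well", "spring"]
high, low = split_by_frequency(tokens, {"bank", "spring", "well"})
```

With this toy input, "bank" lands in the high-frequency dictionary and "spring" and "well" in the low-frequency one. A real system would presumably derive the cutoff from the corpus rather than fix it by hand.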

Keywords: machine translation; ambiguous words; mining and recognition; disambiguation; rural (native-soil) fiction

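Classification: TP391 [Automation and Computer Technology: Computer Application Technology]; H319 [Automation and Computer Technology: Computer Science and Technology]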

 
