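Author: Qin Ying (秦颖)[1]
Source: 《现代图书情报技术》 (New Technology of Library and Information Service), 2014, No. 7, pp. 114-119, 6 pages
Funding: One of the research outputs of the university-level special research fund project "Research and Implementation of Automatic Evaluation of Student Translations Based on Parallel Corpora" (Project No. 2009JJ056) and the National Education Science Planning project "Research and Implementation of a Computer-Aided Transliteration System" (Project No. GPA115033)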
Abstract: [Objective] To retrieve translated, corresponding content in English-Chinese cross-lingual plagiarism documents. [Methods] Similarity analysis is based on bilingual lexicons. Several lexicons are merged and sorted to improve the precision and efficiency of word-level matching. Overall word-frequency distribution and matching-position features are used to resolve ambiguity and multiple matches. Sentence and paragraph similarity is computed as a weighted combination of features such as word correspondence and word position. [Results] Experiments on real translation corpora show a retrieval precision of 0.841 and a recall of 0.748. [Limitations] Translation relations for out-of-vocabulary (OOV) words are hard to determine from lexicons. [Conclusions] Retrieving cross-lingual similar content with bilingual lexicons is simple to implement and widely applicable.
Classification No.: TP391.1 [Automation and Computer Technology: Computer Application Technology]
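The abstract describes scoring sentence similarity from lexicon matches weighted by word position. The following is a minimal sketch of that general idea, not the paper's implementation: the toy lexicon entries, the function name `sentence_similarity`, the `position_weight` parameter, and the scoring formula are all assumptions made for illustration, since the abstract does not give the actual features or weights.

```python
from typing import Dict, List, Set

# Toy bilingual lexicon (hypothetical entries, not taken from the paper).
LEXICON: Dict[str, Set[str]] = {
    "student": {"学生"},
    "borrow": {"借"},
    "book": {"书", "预订"},
    "library": {"图书馆"},
}


def sentence_similarity(
    en_tokens: List[str],
    zh_tokens: List[str],
    lexicon: Dict[str, Set[str]],
    position_weight: float = 0.5,
) -> float:
    """Return a score in [0, 1] for an English/Chinese sentence pair.

    Each English token is matched to at most one unused Chinese token that
    the lexicon lists as a translation. When several candidates exist, the
    one closest in relative position wins, which is one simple way to
    resolve multiple matches. The formula is an illustrative assumption.
    """
    if not en_tokens or not zh_tokens:
        return 0.0

    used: Set[int] = set()  # indices of Chinese tokens already matched
    total = 0.0
    for i, word in enumerate(en_tokens):
        translations = lexicon.get(word.lower(), set())
        rel_i = i / max(len(en_tokens) - 1, 1)
        best_j, best_score = None, -1.0
        for j, zh in enumerate(zh_tokens):
            if j in used or zh not in translations:
                continue
            rel_j = j / max(len(zh_tokens) - 1, 1)
            # A match is worth 1.0, discounted by positional distance.
            score = 1.0 - position_weight * abs(rel_i - rel_j)
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            used.add(best_j)
            total += best_score

    # Normalize by the longer sentence so unmatched words lower the score.
    return total / max(len(en_tokens), len(zh_tokens))


aligned = sentence_similarity(["student", "borrow", "book"], ["学生", "借", "书"], LEXICON)
partial = sentence_similarity(["student", "borrow", "book"], ["书", "学生"], LEXICON)
unrelated = sentence_similarity(["student", "borrow", "book"], ["天气", "很", "好"], LEXICON)
```

In this sketch a fully aligned pair scores highest, a reordered and incomplete pair scores lower, and a pair with no lexicon matches scores zero. Words missing from the lexicon never match, which mirrors the out-of-vocabulary limitation noted in the abstract.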