检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:吕晓伟[1] 章露露[1] LV Xiao-wei;ZHANG Lu-lu(Faculty of Information Engineering and Automation,Kunming University of Science and Technology,Kunming 650500,China)
机构地区:[1]昆明理工大学信息工程与自动化学院,云南昆明650500
出 处:《软件导刊》2018年第9期193-195,共3页Software Guide
摘 要:词义消歧在多个领域有重要应用。基于Lesk及其改进算法是无监督词义消歧研究的典型代表,但现有算法多基于上下文与义项词覆盖,通常未考虑上下文中词与歧义词的距离影响。为此提出一种基于词向量的词义消歧方法,利用向量表示上下文以及义项,并考虑融合上下文与义项的语义相似度及义项分布频率进行词义消歧。在Senseval-3数据集上测试,结果表明,该方法能有效实现词义消歧。Word sense disambiguation have important applications in many fields.Lesk algorithm and its improved algorithm are typical representatives of unsupervised word-sense disambiguation.However,most of the existing algorithms are mostly based on word coverage of context and gloss.In addition,the effect of distance between ambiguous words and word in context is not considered.This paper proposes a method of word-sense disambiguation based on word vectors,which uses vectors to represent contexts and gloss and also considers combined semantic similarity between context and gloss with the distribution frequency of gloss.The test results on the Senseval-3 dataset show that this method can effectively achieve word-sense disambiguation.
关 键 词:词义消歧 词向量 自然语言处理 机器翻译 Word2vec
分 类 号:TP391[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:3.144.178.2