基于词向量的无监督词义消歧方法  被引量:3

Unsupervised Word Disambiguation Method Based on Word Embeddings

在线阅读下载全文

作  者:吕晓伟[1] 章露露[1] LV Xiao-wei;ZHANG Lu-lu(Faculty of Information Engineering and Automation,Kunming University of Science and Technology,Kunming 650500,China)

机构地区:[1]昆明理工大学信息工程与自动化学院,云南昆明650500

出  处:《软件导刊》2018年第9期193-195,共3页Software Guide

摘  要:词义消歧在多个领域有重要应用。基于Lesk及其改进算法是无监督词义消歧研究的典型代表,但现有算法多基于上下文与义项词覆盖,通常未考虑上下文中词与歧义词的距离影响。为此提出一种基于词向量的词义消歧方法,利用向量表示上下文以及义项,并考虑融合上下文与义项的语义相似度及义项分布频率进行词义消歧。在Senseval-3数据集上测试,结果表明,该方法能有效实现词义消歧。Word sense disambiguation have important applications in many fields.Lesk algorithm and its improved algorithm are typical representatives of unsupervised word-sense disambiguation.However,most of the existing algorithms are mostly based on word coverage of context and gloss.In addition,the effect of distance between ambiguous words and word in context is not considered.This paper proposes a method of word-sense disambiguation based on word vectors,which uses vectors to represent contexts and gloss and also considers combined semantic similarity between context and gloss with the distribution frequency of gloss.The test results on the Senseval-3 dataset show that this method can effectively achieve word-sense disambiguation.

关 键 词:词义消歧 词向量 自然语言处理 机器翻译 Word2vec 

分 类 号:TP391[自动化与计算机技术—计算机应用技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象