Word Sense Disambiguation Based on the VCK-Vector Model  Cited by: 1

Authors: DAI Hong-tao; HOU Kai-hu [1]; ZHOU Zhou; XIAO Ling-yun (School of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, Yunnan Province)

Affiliation: [1] School of Mechanical and Electrical Engineering, Kunming University of Science and Technology

Source: Software (《软件》), 2020, No. 2, pp. 134-140 (7 pages)

Abstract: Natural language processing (NLP) aims to help computers better understand human language. However, sentence segments and words in natural language are often polysemous and ambiguous, so computers cannot directly convert them into binary codes they can recognize; the paper identifies this as the biggest current problem in the field. The paper combines a part-of-speech tagging model based on the Viterbi algorithm, the CBOW language model, and the K-Means clustering algorithm to build a word-vector-based combined disambiguation model for polysemous words (VCK-Vector). The model is evaluated through part-of-speech distribution comparison, a semantic relatedness task, and clustering-effect analysis, and its output is finally compared with Baidu AI word vectors. The results indicate that the VCK-vector model is feasible in practical application scenarios.
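The abstract names three components but does not say how they are wired together. The following is a minimal sketch of one plausible arrangement, not the authors' implementation: a Viterbi decoder tags each sentence, a CBOW-style context vector (the average of neighbouring word vectors) is built for each occurrence of an ambiguous word, and K-Means groups those context vectors into candidate senses. The toy HMM probabilities, the 2-D word vectors, the English example word "bank", and the function names `viterbi`, `context_vector`, and `kmeans` are all hypothetical and chosen only for illustration.

```python
import math

# Step 1 (V): log-space Viterbi decoding for a toy HMM part-of-speech tagger.
def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    def lp(p):  # floor zero probabilities so the log is defined
        return math.log(max(p, floor))
    # best[i][t] = (log-prob of best path ending in tag t at position i, back-pointer)
    best = [{t: (lp(start_p[t]) + lp(emit_p[t].get(words[0], 0.0)), None) for t in tags}]
    for w in words[1:]:
        col = {}
        for t in tags:
            prev, score = max(((p, best[-1][p][0] + lp(trans_p[p][t])) for p in tags),
                              key=lambda x: x[1])
            col[t] = (score + lp(emit_p[t].get(w, 0.0)), prev)
        best.append(col)
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):  # follow back-pointers
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Step 2 (C): CBOW-style context representation, the mean of neighbouring word vectors.
def context_vector(tokens, index, vectors, window=2):
    lo, hi = max(0, index - window), min(len(tokens), index + window + 1)
    ctx = [vectors[t] for i, t in enumerate(tokens[lo:hi], start=lo)
           if i != index and t in vectors]
    dim = len(next(iter(vectors.values())))
    if not ctx:
        return [0.0] * dim
    return [sum(v[d] for v in ctx) / len(ctx) for d in range(dim)]

# Step 3 (K): plain K-Means with deterministic initialisation (first k points).
def kmeans(points, k, iters=20):
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical toy HMM (made-up probabilities).
tags = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"river": 0.3, "bank": 0.3, "water": 0.2, "money": 0.2},
          "V": {"bank": 0.1, "water": 0.2, "flows": 0.7}}
pos_tags = viterbi(["river", "bank", "flows"], tags, start_p, trans_p, emit_p)

# Hypothetical 2-D word vectors standing in for trained CBOW embeddings.
vectors = {"river": [1.0, 0.0], "water": [0.9, 0.1], "fish": [0.8, 0.2],
           "money": [0.0, 1.0], "loan": [0.1, 0.9], "deposit": [0.2, 0.8]}
sentences = [["river", "bank", "water"], ["money", "bank", "loan"],
             ["fish", "river", "bank"], ["deposit", "money", "bank"]]
contexts = [context_vector(s, s.index("bank"), vectors) for s in sentences]
labels, centers = kmeans(contexts, k=2)

print(pos_tags)  # tag sequence for the toy sentence
print(labels)    # sense cluster assigned to each occurrence of "bank"
```

In this toy run the two "river" contexts and the two "money" contexts fall into separate clusters, which is the behaviour a sense-clustering stage is meant to show. A real pipeline would replace the hand-written vectors with embeddings trained on a corpus and would need a principled choice of the number of clusters.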

Keywords: natural language processing; polysemous word disambiguation; VCK-vector model

Classification number: TP391.9 [Automation and Computer Technology: Computer Application Technology]

 
