Authors: DAI Hong-tao (戴洪涛), HOU Kai-hu (侯开虎) [1], ZHOU Zhou (周洲), XIAO Ling-yun (肖灵云) (School of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, Yunnan Province)
Affiliation: [1] School of Mechanical and Electrical Engineering, Kunming University of Science and Technology
Source: Software (《软件》), 2020, No. 2, pp. 134-140 (7 pages)
Abstract: Natural language processing (NLP) aims to make computers better understand human language. However, sentences and words in natural language are polysemous and ambiguous, and computers cannot convert them directly into binary codes they can recognize; the paper identifies this as the biggest problem currently facing the NLP field. The paper combines a part-of-speech tagging model based on the Viterbi algorithm, the CBOW language model, and the K-Means clustering algorithm to build a word-vector-based combined disambiguation model for polysemous words (VCK-Vector). The model is evaluated through part-of-speech distribution comparison, a semantic relatedness task, and clustering-quality analysis, and its output is finally compared with Baidu AI word vectors. The results show that the VCK-Vector model is feasible in practical application scenarios.
Keywords: natural language processing; polysemous word disambiguation; VCK-Vector model
Classification code: TP391.9 [Automation and Computer Technology: Computer Application Technology]
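Note: the abstract describes clustering word vectors with K-Means to separate the senses of a polysemous word. The sketch below illustrates only that clustering step and is not the authors' implementation. It assumes each occurrence of a word has already been reduced to a context vector (in the paper's pipeline this would come from CBOW after part-of-speech tagging; here the 2-D vectors are made-up toy values), and the names `kmeans`, `mean`, and `contexts` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iters=20):
    """Plain K-Means with deterministic farthest-point initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Toy context vectors for six occurrences of one ambiguous word (assumed data):
# the first three stand for one sense, the last three for another.
contexts = [(0.9, 0.1), (1.0, 0.0), (0.8, 0.2), (0.1, 0.9), (0.0, 1.0), (0.2, 0.8)]
labels, centroids = kmeans(contexts, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Under these assumptions, each resulting centroid can be read as one sense vector for the word, and the label of an occurrence indicates which sense its context is closest to.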