Authors: QIAO Meng; LIU Huijun [1]; LIANG Guanghui [2]
Affiliations: [1] College of Computer Science, Chongqing University, Chongqing 400000, China; [2] Information Engineering University, Zhengzhou, Henan 450001, China
Source: Journal of Information Engineering University, 2018, No. 4, pp. 447-452 (6 pages)
Funding: National Natural Science Foundation of China (61572518)
Abstract: Word vectors (Word2Vec) have become an important technique in natural language processing in recent years and play a major role in recent advances in artificial intelligence. Word2Vec represents each word as a point in a vector space, which in turn gives a probabilistic representation of the word. The Word Mover Distance (WMD) algorithm is a special case of the Earth Mover Distance and computes the minimum distance between sets of vectors. Building on these two algorithms, the article applies a spatial-mapping preprocessing step to the word vectors, passes the result to WMD as input, and obtains a similarity score between sentences. Experiments show that the method widens the distance gap between similar and unrelated sentences and brings similar questions in an expert system closer together. It therefore captures the semantic similarity between sentences more clearly and helps improve the accuracy of short-text matching.
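To make the idea of a word-vector-based sentence distance concrete, here is a minimal sketch in Python. It does not reproduce the article's method: the abstract does not specify the spatial-mapping preprocessing, so that step is omitted, and instead of the exact WMD transport problem the sketch uses the simpler relaxed lower bound, in which each word sends all of its weight to the nearest word of the other sentence. The two-dimensional vectors in `TOY_VECTORS` and all function names are hypothetical and stand in for trained Word2Vec embeddings.

```python
import math

# Hypothetical 2-D embeddings, for illustration only; real Word2Vec
# vectors are learned from a corpus and have many more dimensions.
TOY_VECTORS = {
    "cat": (1.0, 0.0),
    "kitten": (0.9, 0.1),
    "sleeps": (0.0, 1.0),
    "naps": (0.1, 0.9),
    "stock": (-1.0, -1.0),
    "falls": (-0.9, -1.1),
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nbow(tokens):
    """Normalized bag of words: map each word to its relative frequency."""
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(source, target, vectors):
    """Cost when every source word ships all its weight to its nearest target word."""
    return sum(
        weight * min(euclidean(vectors[word], vectors[other]) for other in target)
        for word, weight in source.items()
    )


def relaxed_wmd(tokens_a, tokens_b, vectors):
    """Relaxed WMD: the larger of the two one-sided costs.

    This is a lower bound on the exact WMD, not the exact value.
    """
    a, b = nbow(tokens_a), nbow(tokens_b)
    return max(one_sided_cost(a, b, vectors), one_sided_cost(b, a, vectors))


similar = relaxed_wmd(["cat", "sleeps"], ["kitten", "naps"], TOY_VECTORS)
unrelated = relaxed_wmd(["cat", "sleeps"], ["stock", "falls"], TOY_VECTORS)
print(f"similar pair:   {similar:.4f}")
print(f"unrelated pair: {unrelated:.4f}")
```

With these toy vectors the paraphrase pair receives a much smaller distance than the unrelated pair, which is the kind of separation the abstract reports measuring. An exact WMD would replace `one_sided_cost` with a solution of the full transport problem.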
Classification: TP312 [Automation and Computer Technology: Computer Software and Theory]