Affiliation: [1] Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2017, No. 5, pp. 661-667 (7 pages)
Funding: National Natural Science Foundation of China (61370129, 61375062, 61632004); Program for Changjiang Scholars and Innovative Research Team in University (IRT201206)
Abstract: Word representation plays an important role in natural language processing and has attracted increasing attention from researchers in recent years due to its simplicity and effectiveness. However, traditional methods for learning word representations generally rely on large amounts of unlabeled text corpora while neglecting the semantic information of words, such as the semantic relationships between them. To make full use of existing domain knowledge bases, which contain rich semantic information about words, this paper proposes a word representation learning method that incorporates semantic information (KbEMF). The method adds domain-knowledge constraint terms to a matrix-factorization model for learning word representations, so that word pairs with strong semantic relationships obtain relatively similar vectors. Results on word analogy reasoning tasks and word similarity measurement tasks with real data show that KbEMF clearly outperforms existing models.
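The idea of adding a knowledge-base constraint to a matrix-factorization embedding model can be illustrated with a toy sketch. This is not the paper's exact objective or update rule: the PMI-style matrix, the pair set `kb_pairs`, the penalty weight `lam`, and the plain gradient-descent loop are all illustrative assumptions. The constraint term penalizes the squared distance between vectors of word pairs known to be semantically related, pulling them together during factorization.

```python
import numpy as np

# Toy sketch (illustrative, not the paper's exact model): factorize a
# PMI-style co-occurrence matrix M ~ W @ C.T, with an added penalty
# lam * ||w_i - w_j||^2 for word pairs (i, j) listed in a knowledge base.

rng = np.random.default_rng(0)
V, d = 6, 4                        # vocabulary size, embedding dimension
M = rng.random((V, V))             # stand-in for a PMI matrix
kb_pairs = [(0, 1), (2, 3)]        # pairs with a strong semantic relation

W = rng.normal(scale=0.1, size=(V, d))   # word vectors
C = rng.normal(scale=0.1, size=(V, d))   # context vectors
lr, lam = 0.05, 5.0                # learning rate, constraint strength

for _ in range(500):
    # gradient of the reconstruction loss 0.5 * ||W C^T - M||_F^2
    E = W @ C.T - M
    gW, gC = E @ C, E.T @ W
    # gradient of the constraint term: pull constrained pairs together
    for i, j in kb_pairs:
        diff = W[i] - W[j]
        gW[i] += lam * diff
        gW[j] -= lam * diff
    W -= lr * gW
    C -= lr * gC

# a constrained pair should typically end up closer than an unrelated one
close = np.linalg.norm(W[0] - W[1])
far = np.linalg.norm(W[0] - W[4])
```

The design choice mirrored here is that the knowledge constraint acts only as a regularizer on the word-vector matrix W, so the factorization still fits the corpus statistics while semantically related words are nudged toward similar representations.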
Keywords: natural language processing; word representation; matrix factorization; semantic information; knowledge base
Classification: TP391.1 [Automation and Computer Technology / Computer Application Technology]