A learning model for representation of knowledge resources on the Web (网络知识资源表示学习模型)

Cited by: 1


Authors: Zhu Guojin (朱国进) [1], Li Chengqian (李承前)

Affiliation: [1] School of Computer Science and Technology, Donghua University, Shanghai 201620, China

Source: Intelligent Computer and Applications (《智能计算机与应用》), 2016, No. 3, pp. 5-10 (6 pages)

Abstract: With the rapid development of computer technology and the Internet, knowledge resources on the Web are growing explosively, and people often cannot effectively obtain and use the resources they need. Making better use of these resources requires automated, intelligent methods for data mining and information extraction. Web documents, as one carrier of Web knowledge resources, consist of unstructured natural language. Before mining techniques such as clustering and classification can be applied, the documents must therefore be converted into a format that machine learning algorithms can process, that is, the text data must be converted into numerical data. To address the limitations of commonly used text representation methods, this paper proposes a representation learning model for Web knowledge resources that combines named entities with word vectors. The model is implemented and explored in the domain of algorithm knowledge, covering both the clustering of online problem-solving reports and search over them. Experimental results show that the proposed method performs well on these tasks.

Keywords: text representation; named entity recognition; conditional random field; algorithm knowledge; word vector

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
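
The abstract describes turning a Web document into a numerical vector by combining named entities with word vectors, then using that vector for clustering and search. The record does not give the paper's actual formulation, so the following is only a minimal sketch of one way such a combination could work: a weighted average of word vectors in which tokens recognized as algorithm-name entities count more. The toy vectors, the `ALGORITHM_ENTITIES` set (standing in for the output of a named entity recognizer such as a conditional random field), the `entity_weight` parameter, and the function names are all assumptions made for illustration.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# In practice these would come from a trained word embedding model.
WORD_VECTORS = {
    "shortest":    [0.9, 0.1, 0.0],
    "path":        [0.8, 0.2, 0.0],
    "dijkstra":    [1.0, 0.0, 0.0],
    "knapsack":    [0.0, 1.0, 0.0],
    "dynamic":     [0.1, 0.9, 0.0],
    "programming": [0.1, 0.8, 0.1],
    "problem":     [0.3, 0.3, 0.3],
}

# Hypothetical stand-in for the output of a named entity recognizer
# that tags algorithm names in a document.
ALGORITHM_ENTITIES = {"dijkstra", "knapsack"}


def embed(tokens, entity_weight=3.0):
    """Return the weighted average of the tokens' word vectors.

    Tokens in ALGORITHM_ENTITIES are weighted by entity_weight, all other
    known tokens by 1.0. Unknown tokens are skipped.
    """
    dim = len(next(iter(WORD_VECTORS.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        vector = WORD_VECTORS.get(token)
        if vector is None:
            continue
        weight = entity_weight if token in ALGORITHM_ENTITIES else 1.0
        for i in range(dim):
            total[i] += weight * vector[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # all-zero vector when no token is known
    return [x / weight_sum for x in total]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def search(query_tokens, documents):
    """Rank document names by cosine similarity to the query, best first."""
    query_vector = embed(query_tokens)
    scored = [(cosine(query_vector, embed(tokens)), name)
              for name, tokens in documents.items()]
    return [name for _, name in sorted(scored, reverse=True)]


reports = {
    "report_a": ["dijkstra", "shortest", "path", "problem"],
    "report_b": ["knapsack", "dynamic", "programming", "problem"],
}
ranking = search(["shortest", "path"], reports)
print(ranking)
```

The same document vectors could be passed to any standard clustering routine. Weighting entities more heavily is just one plausible design choice here, and it should not be read as the method the paper evaluates.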

 
