Authors: WEI Wei [1,2]; MENG Xiangzhu [3]; GUO Chonghui (Center for Energy, Environment & Economy Research, Zhengzhou University, Zhengzhou 450001, China; Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China; School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China)
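Affiliations: [1] Center for Energy, Environment & Economy Research, Zhengzhou University, Zhengzhou 450001; [2] Institute of Systems Engineering, Dalian University of Technology, Dalian 116024; [3] School of Computer Science and Technology, Dalian University of Technology, Dalian 116024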
Source: Systems Engineering-Theory & Practice (《系统工程理论与实践》), 2020, No. 5, pp. 1293-1303 (11 pages)
Funding: National Natural Science Foundation of China (71771034); Jieyang Science and Technology Plan Project (2017xm041).
Abstract: Feature selection is a basic and important task in text mining: it supplies the data processing methods and technical support that later text mining tasks depend on, and feature word ranking is its key step. This paper proposes a new feature word ranking method that combines a text's statistical and structural information with the idea of manifold ranking. A conditional co-occurrence degree word network, which reflects the semantic and structural information latent in the original text, is constructed and treated as the manifold structure among the feature words. Term frequency statistics serve as the initial weights of the feature words. Similarity among feature words is then learned using manifold ranking and graph learning theory, which yields a ranking of the words by importance. Numerical experiments on both public corpora and a supplementary corpus compare the method with several other feature word ranking methods, and the results verify its effectiveness. The method broadens the application of manifold ranking and graph learning theory in text mining and provides a new method and strategy for ranking feature words in a single document.
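The pipeline outlined in the abstract (word network, term-frequency initial weights, manifold ranking) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the conditional co-occurrence degree, so the sketch assumes plain sentence-level co-occurrence counts as edge weights, and it uses the standard manifold ranking update f <- alpha * S * f + (1 - alpha) * y. The function name `rank_words` and the parameters `alpha` and `iterations` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def rank_words(sentences, alpha=0.85, iterations=100):
    """Rank the words of one document by manifold ranking on a word graph.

    sentences: list of tokenized sentences (lists of strings).
    Returns (words sorted by descending score, dict of scores).
    """
    # Initial weights y: normalized term frequency of each word.
    tf = Counter(word for sent in sentences for word in sent)
    total = sum(tf.values())
    words = sorted(tf)
    y = {w: tf[w] / total for w in words}

    # Edge weights W. ASSUMPTION: sentence-level co-occurrence counts,
    # used as a stand-in for the paper's conditional co-occurrence degree.
    weight = Counter()
    for sent in sentences:
        for a, b in combinations(sorted(set(sent)), 2):
            weight[(a, b)] += 1
            weight[(b, a)] += 1

    # Node degrees, i.e. the diagonal of D.
    degree = Counter()
    for (a, _), w in weight.items():
        degree[a] += w

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    s_norm = {
        (a, b): w / sqrt(degree[a] * degree[b])
        for (a, b), w in weight.items()
    }

    # Standard manifold ranking iteration: f <- alpha * S f + (1 - alpha) * y.
    f = dict(y)
    for _ in range(iterations):
        nxt = {w: (1 - alpha) * y[w] for w in words}
        for (a, b), s_ab in s_norm.items():
            nxt[a] += alpha * s_ab * f[b]
        f = nxt

    ranked = sorted(words, key=lambda w: (-f[w], w))
    return ranked, f


if __name__ == "__main__":
    doc = [["graph", "ranking", "word"], ["graph", "ranking"], ["graph", "text"]]
    ranked, scores = rank_words(doc)
    for w in ranked:
        print(w, round(scores[w], 4))
```

Because `alpha` is below 1 and the eigenvalues of S have magnitude at most 1, the iteration converges to a fixed point. A larger `alpha` gives the graph structure more influence relative to the raw term frequencies.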