Authors: Huang Wei (黄炜), Huang Jianqiao (黄建桥), Li Yuefeng (李岳峰), School of Economics and Management, Hubei University of Technology, Wuhan 430064
Affiliation: [1] School of Economics and Management, Hubei University of Technology, Wuhan 430064
Source: Journal of Intelligence (《情报杂志》), 2019, No. 3, pp. 203-206, I0001, 186 (6 pages)
Funding: National Natural Science Foundation of China projects "Multi-Kernel Methods for Real-Time Active Perception of Online Public Opinion Events in the Microblog Environment" (No. 71303075) and "Unsupervised Text Classification Methods Based on Feature Ontology Learning in a Big Data Environment" (No. 71571064)
Abstract: [Purpose/Significance] The sparse autoencoder is an efficient text feature extraction method in deep learning and helps address the high dimensionality and sparsity of large-scale terrorism-related short texts. [Method/Process] First, an unsupervised sparse autoencoder is used to reduce dimensionality and extract latent features from the data; the LDA topic model is then applied to cluster the texts, and the effectiveness and efficiency of the method are verified by comparing its experimental results with those of traditional feature extraction algorithms. [Result/Conclusion] The experiments show that feeding the text features extracted by the sparse autoencoder into LDA topic clustering effectively addresses the high dimensionality, sparsity, and noise of terrorism-related short texts and significantly improves the accuracy of the clustering results.
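The two-stage pipeline described in the abstract, reducing high-dimensional text vectors with a sparse autoencoder and then clustering the resulting codes with an LDA topic model, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy corpus, vocabulary size, code dimension, L1 activity penalty (a common stand-in for the classic KL-divergence sparsity constraint), and the number of topics are all assumed values, and scikit-learn's LatentDirichletAllocation stands in for whatever LDA variant the paper uses.

    # Sketch of the pipeline: sparse autoencoder -> LDA topic clustering.
    # All hyperparameters and the placeholder corpus are illustrative assumptions.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from tensorflow import keras
    from tensorflow.keras import layers, regularizers

    docs = ["short text 1 ...", "short text 2 ...", "short text 3 ..."]  # placeholder corpus
    X = TfidfVectorizer(max_features=5000).fit_transform(docs).toarray().astype("float32")

    input_dim, code_dim = X.shape[1], 128
    inputs = keras.Input(shape=(input_dim,))
    # ReLU code layer with an L1 activity penalty, encouraging sparse activations.
    code = layers.Dense(code_dim, activation="relu",
                        activity_regularizer=regularizers.l1(1e-5))(inputs)
    outputs = layers.Dense(input_dim, activation="sigmoid")(code)

    autoencoder = keras.Model(inputs, outputs)
    encoder = keras.Model(inputs, code)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(X, X, epochs=20, batch_size=64, verbose=0)  # unsupervised reconstruction

    # ReLU codes are non-negative, so they can be passed to scikit-learn's LDA
    # as pseudo-counts; the paper's own LDA clustering step may differ in detail.
    features = encoder.predict(X, verbose=0)
    lda = LatentDirichletAllocation(n_components=10, random_state=0)
    topic_dist = lda.fit_transform(features)
    clusters = topic_dist.argmax(axis=1)  # hard cluster label for each document

Compared with clustering raw TF-IDF vectors directly, the low-dimensional sparse codes give the topic model a denser, less noisy representation, which is the effect the abstract reports.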