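Authors: Zhai Wenjie (翟文洁)[1], Yan Yan (闫琰)[2], Zhang Bowen (张博文)[1], Yin Xucheng (殷绪成)[1]
Affiliations: [1] Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083; [2] Department of Computer Science and Technology, China University of Mining and Technology, Beijing 100083
Source: 《情报工程》 (Technology Intelligence Engineering), 2016, No. 5, pp. 30-40 (11 pages)
Funding: Supported by the National Natural Science Foundation of China project "Natural scene text recognition combining feedforward and feedback mechanisms" (61473036)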
Abstract: This paper studies multi-class text representation and classification based on a hybrid deep belief network, aiming to address two weaknesses of the traditional Bag-of-Words (BOW) representation: it ignores the semantic information in text, and its extracted features are high-dimensional and highly sparse. Working from text keywords and targeting multi-class classification tasks (such as news and biomedical texts), the method takes the word-vector representations of keywords as the text input and combines a Deep Belief Network (DBN) with a Deep Boltzmann Machine (DBM) to form a Hybrid Deep Belief Network (HDBN) model. Experiments on text classification and text retrieval show that the deep learning model based on word-vector embeddings outperforms traditional methods. In addition, a two-dimensional visualization experiment shows that the high-level text representations extracted by the HDBN model exhibit high cohesion and low coupling.
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]