Research on Semantic Classification of Scientific and Technical Literature Based on Deep Learning (基于深度学习的科技文献语义分类研究)  Cited by: 12


Authors: Xie Hongling (谢红玲), Feng Guohe (奉国和)[1], He Weilin (何伟林)

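Affiliation: [1] Department of Information Management, School of Economics and Management, South China Normal University, Guangzhou, Guangdong 510006, China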

Source: Information Studies: Theory & Application (《情报理论与实践》), 2018, No. 11, pp. 149-154 (6 pages)

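Funding: An outcome of the 2016 National Social Science Fund of China project "Research on Knowledge Discovery in Scientific and Technical Literature Based on Text Mining" (project No. 16BTQ071) and the 2016 South China Normal University Graduate Innovation Project "Research on Scientific and Technical Literature Mining Based on Deep Learning" (project No. 2016wkxm62).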

Abstract: [Purpose/significance] The volume of scientific and technical (S&T) literature is growing rapidly, and automatic text classification can improve both the efficiency and the accuracy of literature classification. Deep learning has proven effective for semantic analysis of natural language, so deep-learning-based semantic analysis can classify S&T literature effectively. [Method/process] For comparison, the S&T literature data were processed in two ways, with and without stop-word removal. Word vectors were then trained with the Word2vec tool, and simple RNN, LSTM, and GRU deep learning models were compared on the classification task. [Result/conclusion] The experiments show that the simple RNN, LSTM, and GRU all classify better on literature from which stop words have not been removed. Of the three models, LSTM classifies best. For the simple RNN and LSTM, the Adam and SGD optimizers give the best results; for GRU, the SGD and Adadelta optimizers give the best results.
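
The comparison described in the abstract can be pictured as a small experiment grid: two preprocessing variants (stop words kept or removed) crossed with the three recurrent models and the optimizers. The sketch below is only an illustration under stated assumptions, not the authors' code. The stop-word list, the whitespace tokenizer, and the function names `preprocess` and `experiment_grid` are hypothetical, and the optimizer list contains only the three named in the abstract (the full set tried in the paper is not given here). Word-vector training and model fitting are left out.

```python
from itertools import product

# Hypothetical stop-word list for illustration; the paper's list is not given.
STOP_WORDS = {"the", "of", "a", "is", "in", "and"}


def preprocess(text, remove_stop_words):
    """Lowercase and whitespace-tokenize; optionally drop stop words.

    Assumption: whitespace tokenization stands in for whatever word
    segmentation the original study applied to its corpus.
    """
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


# Only names that appear in the abstract.
MODELS = ["SimpleRNN", "LSTM", "GRU"]
OPTIMIZERS = ["Adam", "SGD", "Adadelta"]


def experiment_grid():
    """Enumerate every (stop-word setting, model, optimizer) combination."""
    return [
        {"remove_stop_words": r, "model": m, "optimizer": o}
        for r, m, o in product([False, True], MODELS, OPTIMIZERS)
    ]


sample = "The classification of scientific literature is a hard task"
kept = preprocess(sample, remove_stop_words=False)
dropped = preprocess(sample, remove_stop_words=True)
grid = experiment_grid()
```

Each entry of `grid` would correspond to one training run. Comparing runs that differ only in `remove_stop_words` isolates the effect of stop-word removal, which is the contrast the abstract reports.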

Keywords: scientific and technical literature; literature classification; deep learning; semantic analysis; stop-word processing

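Classification codes: TP391.1 [Automation and Computer Technology: Computer Application Technology]; TP181 [Automation and Computer Technology: Computer Science and Technology]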

 
