Automatic Text Classification Algorithm Based on TI-LSTM and Its Application  Cited by: 4

Research on Automatic Text Classification Based on TI-LSTM


Authors: CHEN Yutian; CHEN Yang[1]; LIANG Hengrui; SUN Shaoyu; SHI Sanzhi[1] (School of Mathematics and Statistics, Changchun University of Science and Technology, Changchun 130022)

Affiliation: [1] School of Mathematics and Statistics, Changchun University of Science and Technology, Changchun 130022

Source: Journal of Changchun University of Science and Technology (Natural Science Edition), 2023, No. 1, pp. 130-136 (7 pages)

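Funding: Project of the Education Department of Jilin Province (JJKH20210809KJ); Changchun University of Science and Technology College Student Innovation and Entrepreneurship Training Program (2021019).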

Abstract: To address the problem of Chinese text classification and improve its accuracy, this paper proposes TI-LSTM, an automatic text classification algorithm that combines TF-IDF with a neural network. The algorithm first extracts features according to the semantic context and quantifies them. A long short-term memory (LSTM) network is then trained on the quantified features and assigns them weights. Finally, the Chinese text is evaluated on the basis of these feature weights. TI-LSTM extracts features accurately while preserving the semantics of the original text. The algorithm was applied to classifying the financial-hardship levels of students at Changchun University of Science and Technology. Compared with traditional KNN, logistic regression, naive Bayes, and LSTM classification methods, both training and test accuracy improved substantially, with accuracy exceeding 86%.
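The feature-quantification step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so this assumes the textbook definition tf = count / document length and idf = log(N / df), and it assumes the text has already been segmented into tokens. The function names `tfidf_weights` and `to_sequence`, the toy documents, and the padding length are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: weight} dict per document.

    docs: list of token lists (assumed to be segmented already).
    Assumption: tf = count / doc_length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            tok: (cnt / total) * math.log(n_docs / df[tok])
            for tok, cnt in counts.items()
        })
    return weights


def to_sequence(tokens, weight_map, max_len):
    """Map tokens to their TF-IDF weights in order, truncated or zero-padded to max_len."""
    seq = [weight_map.get(tok, 0.0) for tok in tokens[:max_len]]
    return seq + [0.0] * (max_len - len(seq))


# Hypothetical toy corpus; real input would be segmented Chinese text.
docs = [
    ["family", "income", "low", "income"],
    ["family", "income", "stable"],
    ["family", "medical", "debt"],
]
weights = tfidf_weights(docs)
sequences = [to_sequence(d, w, max_len=5) for d, w in zip(docs, weights)]
```

Each entry of `sequences` is a fixed-length list of weights in token order, which is one plausible shape for the quantified features that a sequence model such as an LSTM would consume. A token that appears in every document (here `"family"`) receives weight zero, while rarer tokens receive larger weights.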

Keywords: neural network; text classification; feature extraction; text quantification; students in financial hardship

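Classification codes: TP391.1 [Automation and Computer Technology: Computer Application Technology]; TP183 [Automation and Computer Technology: Computer Science and Technology]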

 
