Chinese Question Classification Algorithm Based on Stacking Ensemble Learning  Cited by: 1


Authors: LIU Jiamei; DING Kai

Affiliation: [1] Information Research Center, The Sixth Academy of China Aerospace Science and Industry Corporation, Hohhot 010000, China

Source: Intelligent Computer and Applications, 2023, No. 9, pp. 85-88 (4 pages)

Abstract: Single-model question classifiers are sensitive to their training data and model parameters, adapt poorly to new scenarios, and generalize weakly. To address these problems and improve Chinese question classification, this paper proposes a Chinese question classification algorithm based on Stacking ensemble learning. The model uses the Stacking framework to combine LightGBM, XGBoost, and Random Forest as base classifiers, with Logistic Regression as the meta-classifier, in order to improve both generalization ability and classification accuracy. The model is trained and validated on an open-source Chinese question dataset available online. Experimental results show that the proposed Stacking-based model improves the F1 score by 2.82% over LightGBM, the best-performing single model. The Stacking-based algorithm can therefore effectively improve the accuracy of Chinese question classification and help question answering systems achieve better performance.
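The stacking arrangement described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's `StackingClassifier` with Logistic Regression as the meta-classifier, but substitutes two `GradientBoostingClassifier` configurations for LightGBM and XGBoost so that the example depends only on scikit-learn (`lightgbm.LGBMClassifier` and `xgboost.XGBClassifier` could be swapped in). The character n-gram TF-IDF features, the hyperparameters, the helper name `build_stacking_model`, and the toy questions and labels are all assumptions made for illustration; the abstract does not specify them.

```python
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_stacking_model(seed: int = 0) -> Pipeline:
    """Hypothetical helper: TF-IDF features feeding a stacked classifier."""
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        # Assumption: stand-in for LightGBM (a gradient-boosted tree model).
        ("gb_a", GradientBoostingClassifier(
            n_estimators=30, max_depth=2, random_state=seed)),
        # Assumption: stand-in for XGBoost, with different hyperparameters.
        ("gb_b", GradientBoostingClassifier(
            n_estimators=30, max_depth=3, learning_rate=0.05,
            random_state=seed)),
    ]
    stack = StackingClassifier(
        estimators=base_learners,
        # Meta-classifier trained on the base learners' out-of-fold
        # class probabilities.
        final_estimator=LogisticRegression(max_iter=1000),
        cv=3,
    )
    return Pipeline([
        # Assumption: character n-grams avoid the need for a word segmenter.
        ("tfidf", TfidfVectorizer(analyzer="char", ngram_range=(1, 2))),
        ("stack", stack),
    ])


# Toy data invented for this sketch: six questions per class.
texts = [
    "谁发明了电话", "中国第一位皇帝是谁", "谁写了红楼梦",
    "相对论是谁提出的", "谁是美国第一任总统", "这首歌是谁唱的",
    "长城在哪里", "故宫位于哪个城市", "珠穆朗玛峰在哪个国家",
    "黄河发源于哪里", "埃菲尔铁塔在哪里", "敦煌位于哪个省",
    "新中国是哪一年成立的", "奥运会什么时候举办", "唐朝是什么时候建立的",
    "第二次世界大战哪一年结束", "春节是什么时候", "这本书是哪一年出版的",
]
labels = ["person"] * 6 + ["location"] * 6 + ["time"] * 6

model = build_stacking_model()
model.fit(texts, labels)
predictions = model.predict(["谁发现了青霉素", "泰山在哪里"])
```

With three base learners and three classes, the meta-classifier here sees nine probability features per question. On a real dataset, the comparison reported in the abstract would correspond to scoring this pipeline and each base learner separately with an F1 metric on held-out data.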

Keywords: question answering system; Chinese question classification; ensemble learning; Stacking

Classification codes: G356 [Cultural Sciences: Information Science]; TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
