Chinese Question Classification Based on Improved Voting Mechanism of Tri-training Algorithm

(Original title: 基于改进Tri-training算法投票机制的中文问句分类)


Authors: WANG Lei (王雷); SUN Zhong-quan (孙中全) [1]

Affiliation: [1] Chuzhou Polytechnic, Chuzhou 239000, Anhui, China

Source: Journal of Changchun Normal University (《长春师范大学学报》), 2023, No. 12, pp. 60-65, 101 (7 pages)

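Funding: Chuzhou Polytechnic Natural Science General Project "Research on Chinese Question Classification with Semi-supervised Learning" (YJY-2020-27); Chuzhou Polytechnic Natural Science Key Project "Construction and Research of an IPv6-Based Campus Network Security Protection System" (YJZ-2021-02); Anhui Province Vocational and Adult Education Project "Construction and Implementation of a Blended Teaching Model for Higher Vocational Public Basic Courses Based on the OBE Concept in the Post-Pandemic Era" (Azcj2022178).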

Abstract: When the three classifiers all give different results, the original Tri-training algorithm defaults to the result of the first classifier as the model's final output, which may reduce classification accuracy in such cases. This paper proposes a voting mechanism based on the "usually excellent" idea (平时优秀思想), which avoids one-sidedly taking the first classifier's result as the model's classification, and applies it in classification experiments on the HIT (Harbin Institute of Technology) Chinese question set and on the extended question set built for this paper. The results show that the proposed algorithm adapts well and clearly improves classification accuracy; appropriately enlarging the training set and the unlabeled sample data can strengthen the classifier's generalization ability and thereby raise classification accuracy.
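The difference between the two voting rules described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact tie-breaking rule, so this sketch assumes that "usually excellent" means trusting the classifier with the best accuracy on some held-out labeled data. The function names, the `accuracies` parameter, the question-type labels, and the accuracy values are all hypothetical and do not come from the paper.

```python
from collections import Counter


def vote_original(preds):
    """Majority vote over three predictions.

    When all three classifiers disagree, fall back to the first
    classifier's prediction, as the abstract describes for the
    original Tri-training algorithm.
    """
    label, count = Counter(preds).most_common(1)[0]
    return label if count >= 2 else preds[0]


def vote_track_record(preds, accuracies):
    """Majority vote with an assumed 'usually excellent' tie-break.

    ASSUMPTION: when all three classifiers disagree, trust the one
    with the highest held-out accuracy. This is an illustrative
    stand-in, not the paper's confirmed rule.
    """
    label, count = Counter(preds).most_common(1)[0]
    if count >= 2:
        return label
    best = max(range(len(preds)), key=lambda i: accuracies[i])
    return preds[best]


if __name__ == "__main__":
    # Hypothetical question-type labels and accuracies, for illustration only.
    preds = ["location", "person", "time"]
    accuracies = [0.71, 0.78, 0.83]
    print(vote_original(preds))
    print(vote_track_record(preds, accuracies))
```

When at least two classifiers agree, both functions return the same label; they differ only in the three-way disagreement case, which is the case the abstract targets.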

Keywords: Tri-training algorithm; voting mechanism; question classification

Classification code: TP391.1 [Automation and Computer Technology - Computer Application Technology]

 
