On Classification of Associative Text Based on Rules Pruning of Mutual Information

Cited by: 1


Authors: Shang Bingzhang[1], Bai Qingyuan[1]

Affiliation: [1] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian 350002, China

Source: Journal of Nanjing Normal University (Engineering and Technology Edition), 2008, No. 4, pp. 173-177 (5 pages)

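Funding: Supported by the Ministry of Education Start-up Fund for Returned Overseas Scholars; the Open Project Fund of the Institute of Software, Chinese Academy of Sciences (SYSKF0701); the Science and Technology Development Fund of Fuzhou University (2005-XQ-13); and the Fund of the Education Department of Fujian Province (JB06023).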

Abstract: Traditional associative text classification algorithms generate a huge number of rules. Leaving the rules unpruned hurts classification efficiency, while the pruning methods used previously reduce classification accuracy to varying degrees. To address this, an associative text classification algorithm is presented that prunes the rules of each class by mutual information. The rules with strong classifying capacity are selected to form the classifier, which is then used to classify new texts. Pruning in this way greatly reduces the number of rules, and the resulting classifier achieves better classification results than both a classifier built from the unpruned rule set and the ARC-BC algorithm, which uses the earlier pruning method. Extensive experiments show that the method is effective.
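The abstract does not give the scoring formula or the selection threshold, so the following is only a minimal sketch of what per-class rule pruning by mutual information could look like. It assumes that a rule is a pair (set of words, class label), that a rule's score is the pointwise mutual information between "the document contains all the rule's words" and "the document belongs to the class", and that the top-scoring rules of each class are kept. The names `rule_mi`, `prune_rules`, and `keep_per_class` are hypothetical and do not come from the paper.

```python
import math
from collections import defaultdict

def rule_mi(antecedent, label, docs):
    """Pointwise mutual information log2(P(A, c) / (P(A) * P(c))).

    Assumed scoring function (not taken from the paper).
    docs is a list of (frozenset_of_words, class_label) pairs.
    """
    n = len(docs)
    n_a = sum(1 for words, _ in docs if antecedent <= words)
    n_c = sum(1 for _, c in docs if c == label)
    n_ac = sum(1 for words, c in docs if c == label and antecedent <= words)
    if n_ac == 0:
        return float("-inf")  # rule never fires for this class
    return math.log2(n_ac * n / (n_a * n_c))

def prune_rules(rules, docs, keep_per_class):
    """Keep the keep_per_class highest-scoring rules of each class."""
    by_class = defaultdict(list)
    for antecedent, label in rules:
        by_class[label].append((rule_mi(antecedent, label, docs), antecedent))
    pruned = {}
    for label, scored in by_class.items():
        scored.sort(key=lambda pair: pair[0], reverse=True)
        pruned[label] = [antecedent for _, antecedent in scored[:keep_per_class]]
    return pruned

# Toy corpus and candidate rules, invented purely for illustration.
docs = [
    (frozenset({"goal", "match"}), "sports"),
    (frozenset({"goal", "team"}), "sports"),
    (frozenset({"stock", "market"}), "finance"),
    (frozenset({"market", "team"}), "finance"),
]
rules = [
    (frozenset({"goal"}), "sports"),
    (frozenset({"team"}), "sports"),
    (frozenset({"market"}), "finance"),
    (frozenset({"team"}), "finance"),
]
pruned = prune_rules(rules, docs, keep_per_class=1)
print(pruned)
```

In this toy corpus the word "team" appears equally often in both classes, so its rules score 0 and are dropped, while "goal" and "market" are kept for their respective classes. A real implementation would need the rule-mining step and the actual mutual information criterion described in the full paper.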

Keywords: mutual information; rule pruning; associative classification

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
