Feature Selection by Applying Feature Resolution and Correlation Matrix of Equivalence Classes

Cited by: 1


Authors: Fu Hongxia (符红霞)[1], Huang Chengbing (黄成兵)[1]

Affiliation: [1] Department of Computer Science, Aba Teachers College (阿坝师范高等专科学校), Wenchuan 623002

Source: Science Technology and Engineering (《科学技术与工程》), 2012, No. 34, pp. 9234-9237, 9242 (5 pages)

Funding: Supported by a university-level research project of Aba Teachers College (ASB12-23)

Abstract: Feature selection is one of the key steps in text categorization, and the quality of the selected feature subset directly affects the categorization results. First, word frequency and document frequency are analyzed, and an improved document frequency is derived from them. Feature resolution is then defined on the basis of the improved document frequency. Next, rough sets are introduced into feature selection, and a new attribute reduction algorithm based on a correlation matrix of equivalence classes is given. Finally, feature resolution is combined with this attribute reduction algorithm to form a new feature selection method. The method first uses feature resolution to preselect text features, filtering out some terms to reduce the sparsity of the text feature space, and then applies the attribute reduction algorithm to eliminate redundant features. Simulation results show that the proposed method has some advantage in precision, recall, time performance, and average classification accuracy.
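
The two-stage idea in the abstract (preselect terms, then remove redundant ones) can be illustrated with a minimal Python sketch. The abstract does not give the formula for feature resolution or the details of the correlation-matrix reduction algorithm, so this sketch substitutes a plain document-frequency threshold for the first stage and a generic greedy rough-set reduction for the second: a feature is dropped if documents that are indiscernible on the remaining features still share a single class label. All function names and the `min_df` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = defaultdict(int)
    for tokens in docs:
        for term in set(tokens):
            df[term] += 1
    return dict(df)


def preselect(docs, min_df=2):
    """Stage 1 (assumed stand-in for feature resolution):
    keep terms whose document frequency is at least min_df."""
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)


def partition(docs, features):
    """Group document indices into equivalence classes: two documents
    fall in the same class if they agree on the presence of every feature."""
    classes = defaultdict(list)
    for i, tokens in enumerate(docs):
        present = set(tokens)
        key = tuple(f in present for f in features)
        classes[key].append(i)
    return sorted(classes.values())


def is_consistent(docs, labels, features):
    """True if every equivalence class contains documents of one label only."""
    return all(
        len({labels[i] for i in cls}) == 1
        for cls in partition(docs, features)
    )


def reduce_features(docs, labels, features):
    """Stage 2 (generic rough-set style reduction, not the paper's algorithm):
    greedily drop each feature whose removal keeps the table consistent."""
    kept = list(features)
    for f in list(features):
        trial = [g for g in kept if g != f]
        if is_consistent(docs, labels, trial):
            kept = trial
    return kept
```

Calling `preselect` on a tokenized corpus and passing its result to `reduce_features` together with the class labels gives a reduced feature list. The greedy pass depends on feature order, so different orderings may yield different reduced sets.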

Keywords: feature selection; text categorization; feature resolution; rough sets; correlation matrix

Classification code: TP391.43 [Automation and Computer Technology: Computer Application Technology]

 
