Heuristic feature selection method for clustering  

(Chinese title: 一种启发式聚类特征选择方法)


Authors: Xu Junling [1], Xu Baowen [1,2], Zhang Weifeng [1,3], Cui Zifeng [1]

Affiliations: [1] School of Computer Science and Engineering, Southeast University; [2] State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072; [3] College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003

Source: Journal of Southeast University (English Edition), 2006, No. 2, pp. 169-175 (7 pages)

Funding: The National Natural Science Foundation of China (No. 60425206, 60503020, 60373066, 60403016); the National Basic Research Program of China (973 Program) (No. 2002CB312000); the Natural Science Foundation of Jiangsu Province (No. BK2005060); the Natural Science Foundation of Jiangsu Higher Education Institutions (No. 04KJB520096).

Abstract: In order to enable clustering to be done in a lower-dimensional space, a new feature selection method for clustering is proposed. The method has three steps, all carried out in a wrapper framework. First, all the original features are ranked by importance, using an evaluation function E(f) introduced to measure the importance of a feature. Second, a set of important features is selected sequentially. Finally, possibly redundant features are removed from the important feature subset. Because features are selected sequentially, there is no need to search the large space of feature subsets, so efficiency is improved. Experimental results show that the method finds the set of features important for clustering and discards unimportant features, as well as features that may hinder the clustering task.
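The three-step wrapper procedure summarized in the abstract can be sketched in Python. This is a minimal illustration, not the paper's algorithm: the importance measure E(f), the stopping rule, and the redundancy test are not specified in this record, so plausible stand-ins are used (single-feature clustering quality as the importance score, an above-average-score cutoff as the stopping rule, and a correlation threshold as the redundancy test), with a toy k-means as the wrapped clustering algorithm.

```python
import numpy as np

def cluster_score(X, k=2, n_iter=20):
    """Toy k-means used as the wrapped clustering algorithm; returns the
    negative within-cluster sum of squares (higher is better).
    Deterministic init: points evenly spaced along the first feature."""
    order = np.argsort(X[:, 0])
    centers = X[order[np.linspace(0, len(X) - 1, k).astype(int)]].astype(float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return -d.min(1).sum()

def select_features(X, k=2, corr_thresh=0.9):
    # Step 1: rank features by a single-feature clustering score,
    # a stand-in for the paper's importance measure E(f).
    scores = np.array([cluster_score(X[:, [f]], k) for f in range(X.shape[1])])
    ranked = np.argsort(scores)[::-1]
    # Step 2: sequentially keep features scoring above the average --
    # a simple stand-in for the paper's (unspecified here) stopping rule.
    selected = [f for f in ranked if scores[f] >= scores.mean()]
    # Step 3: drop features highly correlated with an already kept one
    # (one plausible notion of "redundant").
    kept = []
    for f in selected:
        if all(abs(np.corrcoef(X[:, f], X[:, g])[0, 1]) < corr_thresh
               for g in kept):
            kept.append(int(f))
    return kept
```

Because features are added one at a time in rank order, the procedure examines only O(n) candidate subsets rather than searching the 2^n subset space, which is the efficiency argument the abstract makes.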

Keywords: feature selection, clustering, unsupervised learning

Classification code: TP391 (Automation and Computer Technology: Computer Application Technology)
