Gene expression data feature selection with neighborhood relation  Cited by: 3


Authors: Chen Yuming[1], Wu Keshou[1], Li Xiangjun[2]

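Affiliations: [1] Department of Computer Science and Technology, Xiamen University of Technology, Xiamen, Fujian 361024; [2] Department of Computer Science and Technology, Nanchang University, Nanchang, Jiangxi 330031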

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2014, No. 2, pp. 210-213 (4 pages)

Funding: National Natural Science Foundation of China, Young Scientists Fund (61103246)

Abstract: Gene feature selection is an important method in the analysis of gene expression data. Rough set theory is an effective classification tool for handling uncertain, inconsistent, and imprecise data; it selects gene features while preserving the classification ability of the gene expression data set. However, classical rough set theory lacks effective methods for processing real-valued data, gene expression data are continuous, and the discretization step that traditional rough set feature selection requires causes information loss. To avoid this loss, the paper applies neighborhood rough set feature selection to gene selection and proposes a gene selection method based on neighborhood rough sets. Starting from all features, the method removes redundant features step by step according to feature significance and finally obtains a key feature subset for classification. Feature selection and classification experiments were carried out on two benchmark gene expression data sets, and the results indicate that the method is effective and feasible for selecting highly discriminative genes in cancer classification tasks.
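The backward elimination described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no formulas, so the Euclidean distance, the neighborhood radius `delta`, the function names `dependency` and `backward_select`, and the stopping rule (stop as soon as removing any feature would lower the dependency of the full feature set) are all assumptions. Features are assumed to be scaled to a comparable range such as [0, 1].

```python
import numpy as np


def dependency(X, y, features, delta):
    """Neighborhood dependency: the fraction of samples whose delta-neighborhood,
    measured over `features` only, contains a single class label."""
    Z = X[:, features]
    # Pairwise Euclidean distances restricted to the chosen features (assumed metric).
    diff = Z[:, None, :] - Z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    neighbors = dist <= delta
    same_class = y[:, None] == y[None, :]
    # A sample is in the positive region if every neighbor shares its label.
    consistent = np.all(~neighbors | same_class, axis=1)
    return float(consistent.mean())


def backward_select(X, y, delta=0.2, tol=1e-12):
    """Start from all features and repeatedly drop the least significant one
    while the dependency of the full feature set is preserved."""
    selected = list(range(X.shape[1]))
    target = dependency(X, y, selected, delta)
    while len(selected) > 1:
        best_feature, best_gamma = None, -1.0
        for f in selected:
            rest = [g for g in selected if g != f]
            gamma = dependency(X, y, rest, delta)
            # The feature whose removal keeps dependency highest is the most redundant.
            if gamma > best_gamma:
                best_feature, best_gamma = f, gamma
        if best_gamma < target - tol:
            break  # every remaining feature is needed to keep the dependency
        selected.remove(best_feature)
    return selected
```

With a Euclidean metric, dropping a feature can only enlarge neighborhoods, so the dependency never increases during elimination; the full feature set therefore gives the reference value that the reduced subset has to match. The pairwise distance matrix costs O(n²) memory, which is acceptable for gene expression data sets with few samples but many genes.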

Keywords: rough set; neighborhood relation; gene expression data; feature selection; classification

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
