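Affiliations: [1] School of Computer and Information Engineering, Fuyang Normal College (阜阳师范学院), Fuyang, Anhui 236037 [2] School of Computer and Information, Hefei University of Technology (合肥工业大学), Hefei 230009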
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2015, No. 6, pp. 1209-1213 (5 pages)
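Funding: National Natural Science Foundation of China (51174257/F030504); Fundamental Research Funds for the Central Universities (2013BHZX0040); Key Special Project Commissioned by Anhui Provincial Research Institutions (2013WLGH01ZD)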
Abstract: With the development of gene chip (microarray) technology, gene expression experiments have produced large volumes of microarray data, offering a new approach to the study of human disease. However, microarray data are high-dimensional, noisy, and highly redundant, which makes it difficult to mine the knowledge they contain accurately and in depth, and difficult to select informative genes. This paper proposes a hybrid feature selection algorithm for high-dimensional microarray data. The algorithm has two layers. The first layer computes the signal-to-noise ratio (SNR) of every gene and, based on these values, selects a specified number of informative genes as a candidate subset, filtering out irrelevant genes. The second layer applies an improved Lasso method to the candidate subset to perform feature selection and eliminate redundant genes. Experimental results show that the proposed algorithm selects a small number of informative genes with strong classification ability, and that its performance is stable and generalizes well, making it an effective gene feature selection algorithm.
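The two-layer procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the abstract gives neither the SNR formula nor the details of the "improved" Lasso, so the code assumes the commonly used two-class definition SNR = |mean1 - mean0| / (std1 + std0) with population standard deviations, and substitutes a plain Lasso solved by cyclic coordinate descent. All function names, parameters (`top_k`, `alpha`, `n_iter`, `tol`), and the toy data are hypothetical.

```python
import statistics


def snr_scores(X, y):
    """Per-gene SNR, assuming SNR = |mean1 - mean0| / (std1 + std0)."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        denom = statistics.pstdev(a) + statistics.pstdev(b)
        diff = abs(statistics.mean(a) - statistics.mean(b))
        scores.append(diff / denom if denom > 0 else 0.0)
    return scores


def soft_threshold(z, t):
    """Shrink z toward zero by t (the Lasso proximal step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, alpha, n_iter=200):
    """Plain Lasso (not the paper's improved variant) by coordinate descent.

    Minimizes (1/2n) * ||y - Xw||^2 + alpha * ||w||_1 on centered data.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    residual = [yi - y_mean for yi in y]  # residual for w = 0
    w = [0.0] * p
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            # Correlation of gene j with the residual that excludes gene j.
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * Xc[i][j]
                w[j] = new_w
    return w


def hybrid_select(X, y, top_k, alpha, tol=1e-8):
    """Layer 1: keep the top_k genes by SNR. Layer 2: keep nonzero Lasso weights."""
    scores = snr_scores(X, y)
    candidates = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:top_k]
    X_sub = [[row[j] for j in candidates] for row in X]
    w = lasso_cd(X_sub, [float(v) for v in y], alpha)
    return [j for j, wj in zip(candidates, w) if abs(wj) > tol]


# Toy data (hypothetical): gene 0 is informative, gene 1 duplicates it
# (redundant), and gene 2 has identical class means (irrelevant).
X_toy = [
    [1.0, 1.0, 2.0],
    [1.2, 1.2, 1.0],
    [0.8, 0.8, 3.0],
    [3.0, 3.0, 2.0],
    [3.2, 3.2, 3.0],
    [2.8, 2.8, 1.0],
]
y_toy = [0, 0, 0, 1, 1, 1]
selected = hybrid_select(X_toy, y_toy, top_k=2, alpha=0.1)
print(selected)
```

On this toy input the first layer drops gene 2, whose class means coincide, and the second layer keeps only one of the two identical genes, which mirrors the filter-then-prune division of labor the abstract describes. In practice `top_k` and `alpha` would need tuning, for example by cross-validation.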
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]