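Affiliations: [1] School of Electronics and Information Engineering, University of Science and Technology Liaoning, Anshan 114044 [2] Northeastern University, Shenyang 110004
Source: Computer Science (《计算机科学》), 2007, No. 8, pp. 171-176 (6 pages)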
Abstract: DNA microarray technology makes it possible to monitor the expression levels of tens of thousands of genes simultaneously. Applying traditional clustering algorithms directly to high-dimensional gene expression data runs into the curse of dimensionality. Feature transformation and feature selection are the two common ways to reduce dimensionality, but the former produces new features that are hard to interpret with the original domain knowledge, and the latter usually loses information. In addition, traditional clustering algorithms usually require user-specified parameters, and different settings can yield very different clusterings. To address these problems, this paper proposes CIS, a new iterative-spread-based clustering algorithm for microarray data. CIS uses neither feature transformation nor feature selection, and it determines its clustering parameters (thresholds) automatically. It repeatedly clusters the genes using the most recently obtained sample clusters as features, then re-clusters the samples using the new gene clusters as features, refining the result step by step. The final result is easy to interpret and avoids information loss, and the method reduces the experimental error caused by users' lack of domain knowledge. CIS is applied to two real microarray data sets, and the experimental results confirm its effectiveness and efficiency.
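The alternating refinement described in the abstract can be illustrated with a minimal sketch. This is not the CIS algorithm itself: the abstract does not give its clustering step, its starting point, or how it selects thresholds automatically. The sketch assumes plain k-means with a deterministic farthest-point initialisation, assumes the first sample clustering uses all genes, and takes the cluster counts `k_genes` and `k_samples` as fixed inputs, which CIS is said to avoid. All function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=20):
    """Plain k-means; returns one cluster label per row.

    Assumption: farthest-point initialisation, chosen only to keep the
    sketch deterministic. The abstract does not specify this.
    """
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c]))
                  for r in rows]
        # Update step: each non-empty cluster's center becomes its mean.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_means(rows, col_labels, k):
    """Replace the columns of each row by k features: the row's mean value
    over each column cluster. No original column is discarded."""
    out = []
    for r in rows:
        feats = []
        for c in range(k):
            vals = [v for v, lab in zip(r, col_labels) if lab == c]
            feats.append(sum(vals) / len(vals) if vals else 0.0)
        out.append(feats)
    return out


def alternate_clustering(expr, k_genes, k_samples, rounds=5):
    """expr is a genes x samples matrix (list of lists of floats).

    Returns (gene_labels, sample_labels).
    """
    samples = [list(col) for col in zip(*expr)]  # samples x genes
    # Assumed starting point: cluster the samples on all genes.
    sample_labels = kmeans(samples, k_samples)
    gene_labels = [0] * len(expr)
    for _ in range(rounds):
        # Cluster genes, using the latest sample clusters as features.
        gene_labels = kmeans(cluster_means(expr, sample_labels, k_samples),
                             k_genes)
        # Re-cluster samples, using the new gene clusters as features.
        new_labels = kmeans(cluster_means(samples, gene_labels, k_genes),
                            k_samples)
        if new_labels == sample_labels:  # stop once the partition is stable
            break
        sample_labels = new_labels
    return gene_labels, sample_labels
```

Because each round averages over clusters rather than dropping columns, every gene and every sample still contributes to the features, which is the sense in which this style of refinement avoids the information loss of feature selection.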
Classification code: TP391.41 [Automation and Computer Technology - Computer Application Technology]