SUBic:A Scalable Unsupervised Framework for Discovering High Quality Biclusters  

SUBic:A Scalable Unsupervised Framework for Discovering High Quality Biclusters

在线阅读下载全文

作  者:Jooil Lee Yanhua Jin Won Suk Lee 

机构地区:[1]Department of Computer Science, Yonsei University

出  处:《Journal of Computer Science & Technology》2013年第4期636-646,共11页计算机科学技术学报(英文版)

基  金:supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education,Science and Technology (MEST) of Korea under Grant No. 2011-0016648

摘  要:A biclustering algorithm extends conventional clustering techniques to extract all of the meaningful subgroups of genes and conditions in the expression matrix of a microarray dataset. However, such algorithms are very sensitive to input parameters and show poor scalability. This paper proposes a scalable unsupervised biclustering framework, SUBic, to find high quality constant-row biclusters in an expression matrix effectively. A one-dimensional clustering algorithm is proposed to partition the attributes, that is, columns of an expression matrix into disjoint groups based on the similarity of expression values. These groups form a set of short transactions and are used to discover a set of frequent itemsets each of which corresponds to a bicluster. However, a bicluster may include any attribute whose expression value is not similar enough to others, so a bicluster refinement is used to enhance the quality of a bicluster by removing those attributes based on its distribution of expression values. The performance of the proposed method is comparatively analyzed through a series of experiments on synthetic and real datasets.A biclustering algorithm extends conventional clustering techniques to extract all of the meaningful subgroups of genes and conditions in the expression matrix of a microarray dataset. However, such algorithms are very sensitive to input parameters and show poor scalability. This paper proposes a scalable unsupervised biclustering framework, SUBic, to find high quality constant-row biclusters in an expression matrix effectively. A one-dimensional clustering algorithm is proposed to partition the attributes, that is, columns of an expression matrix into disjoint groups based on the similarity of expression values. These groups form a set of short transactions and are used to discover a set of frequent itemsets each of which corresponds to a bicluster. However, a bicluster may include any attribute whose expression value is not similar enough to others, so a bicluster refinement is used to enhance the quality of a bicluster by removing those attributes based on its distribution of expression values. The performance of the proposed method is comparatively analyzed through a series of experiments on synthetic and real datasets.

关 键 词:BICLUSTERING CLUSTERING expression matrix frequent itemset sub-matrix 

分 类 号:TP311.13[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象