Descriptors for DNA Sequences Based on Joint Diagonalization of Their Feature Matrices from Dinucleotide Physicochemical Properties  

Descriptors for DNA Sequences Based on Joint Diagonalization of Their Feature Matrices from Dinucleotide Physicochemical Properties

在线阅读下载全文

作  者:Hongjie Yu Deshuang Huang 

机构地区:[1]the Department of Mathematics, School of Science, Anhui Science and Technology University [2]Machine Learning and Systems Biology Laboratory, Tongji University

出  处:《Tsinghua Science and Technology》2013年第5期446-453,共8页清华大学学报(自然科学版(英文版)

基  金:supported by the Key Project from Education Department of Anhui Province (No.KJ2013A076);the PhD Programs Foundation of Ministry of Education of China (No.20120072110040);the National Natural Science Foundation of China (Nos.61133010,31071168,and 61005010);the China Postdoctoral Science Foundation (No.2012T50582)

摘  要:Numerical characterizations of DNA sequence can facilitate analysis of similar sequences. To visualize and compare different DNA sequences in less space, a novel descriptors extraction approach was proposed for numerical characterizations and similarity analysis of sequences. Initially, a transformation method was introduced to represent each DNA sequence with dinucleotide physicochemical property matrix. Then, based on the approximate joint diagonalization theory, an eigenvalue vector was extracted from each DNA sequence,which could be considered as descriptor of the DNA sequence. Moreover, similarity analyses were performed by calculating the pair-wise distances among the obtained eigenvalue vectors. The results show that the proposed approach can capture more sequence information, and can jointly analyze the information contained in all involved multiple sequences, rather than separately, whose effectiveness was demonstrated intuitively by constructing a dendrogram for the 15 beta-globin gene sequences.Numerical characterizations of DNA sequence can facilitate analysis of similar sequences. To visualize and compare different DNA sequences in less space, a novel descriptors extraction approach was proposed for numerical characterizations and similarity analysis of sequences. Initially, a transformation method was introduced to represent each DNA sequence with dinucleotide physicochemical property matrix. Then, based on the approximate joint diagonalization theory, an eigenvalue vector was extracted from each DNA sequence,which could be considered as descriptor of the DNA sequence. Moreover, similarity analyses were performed by calculating the pair-wise distances among the obtained eigenvalue vectors. The results show that the proposed approach can capture more sequence information, and can jointly analyze the information contained in all involved multiple sequences, rather than separately, whose effectiveness was demonstrated intuitively by constructing a dendrogram for the 15 beta-globin gene sequences.

关 键 词:descriptors approximate joint diagonalization dendrogram physicochemical property similarity analysis 

分 类 号:Q523[生物学—生物化学] TN911.7[电子电信—通信与信息系统]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象