Application of the Fisher Linear Discriminant Function to the Study of Inter-Genome Distances Based on COGs Classification  Cited by: 2

The Application of Fisher Linear Discriminant to Distance Between Genomes Which Based on COGs


Authors: Liu Rong (刘蓉)[1], Wang Yuelan (王月兰)[2], Zhu Xiaopeng (朱小蓬)[2], Ling Lunjiang (凌伦奖)[2], Han Rushan (韩汝珊)[1]

Affiliations: [1] Department of Physics, Peking University, Beijing 100871; [2] Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101

Source: Progress in Biochemistry and Biophysics (《生物化学与生物物理进展》), 2002, No. 5, pp. 760-765 (6 pages)

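Funding: National Natural Science Foundation of China (39890070; 19890380; 39993420); Chinese Academy of Sciences Innovation Program (KSCX2-2-07; KJCX1-08); special funding from the Beijing Municipal Science and Technology Commission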

Abstract: A new method for constructing phylogenetic trees from whole-genome information is introduced. Each gene of a genome is represented by a 17-dimensional vector whose components describe the degree to which its encoded protein belongs to each of the 17 COGs (clusters of orthologous groups of proteins) classes; the vectors of all genes in a genome form a set. The Fisher linear discriminant is then used to find a set of optimal weights, and every vector in each set is linearly transformed accordingly, so that the relative importance of the 17 COGs classes to genome evolution is reflected more accurately. Finally, the distance between two genomes is represented by the distance between the corresponding sets of transformed vectors, and the resulting distance matrix is used to construct a phylogenetic tree with the PHYLIP software package. Trees built in this way from 38 and from 43 genomes both support Woese's 'three primary kingdoms' theory. The method overcomes a shortcoming of other whole-genome approaches, which have difficulty comparing genomes that differ greatly in size. In addition, it reduces the distortion of inter-genome distances caused by lateral gene transfer.
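The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It uses the standard two-class Fisher direction w = Sw^{-1}(m1 - m2) on two sets of 17-dimensional vectors and then measures a set distance on the projected data. The function names `fisher_direction` and `set_distance`, the ridge term, the random toy data, and the choice of "gap between projected means" as the set distance are all hypothetical.

```python
import numpy as np

def fisher_direction(X1, X2, ridge=1e-6):
    """Two-class Fisher discriminant direction w = Sw^{-1} (m1 - m2), unit length."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of the two sets' scatter matrices.
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    # Small ridge term (an assumption) keeps Sw safely invertible.
    Sw = Sw + ridge * np.eye(Sw.shape[0])
    w = np.linalg.solve(Sw, m1 - m2)
    return w / np.linalg.norm(w)

def set_distance(X1, X2, w):
    """Hypothetical set distance: gap between the projected set means."""
    return abs(X1.mean(axis=0) @ w - X2.mean(axis=0) @ w)

rng = np.random.default_rng(0)
# Toy stand-ins for two genomes of very different size: one row per gene,
# 17 columns of COGs-class membership degrees (random values, not real data).
genome_a = rng.random((500, 17))
genome_b = rng.random((4000, 17)) * np.linspace(0.5, 1.5, 17)

w = fisher_direction(genome_a, genome_b)
d = set_distance(genome_a, genome_b, w)
print(w.shape, d)
```

Because this sketch compares set means rather than gene counts, the distance does not depend directly on how many genes each genome has, which is in the spirit of the abstract's remark about genomes of very different size.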

Keywords: Fisher linear discriminant function; COGs; classification; inter-genome distance; application

Classification number: Q7 [Biology: Molecular Biology]

 
