Computational prediction of over-annotated protein-coding genes in the genome of Agrobacterium tumefaciens strain C58  (Cited by: 1)



Authors: 于家峰, 隋天翔, 王红梅, 王春玲, 荆莉, 王吉华

Affiliations: [1] Shandong Provincial Key Laboratory of Functional Macromolecular Biophysics, Institute of Biophysics, Dezhou University; [2] State Key Laboratory of Bioelectronics, Southeast University; [3] College of Life Science, Shandong Normal University; [4] College of Physics and Electronic Information, Dezhou University

Source: Chinese Physics B (中国物理B, English edition), 2015, No. 12, pp. 98-104 (7 pages)

Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61302186 and 61271378) and by funding from the State Key Laboratory of Bioelectronics of Southeast University.

Abstract: Agrobacterium tumefaciens strain C58 is a pathogen that can cause tumors in some dicotyledonous plants. Ever since the genome of A. tumefaciens strain C58 was sequenced, the quality of the annotation of its protein-coding genes has been questioned repeatedly, because the annotation varies greatly among different databases. In this paper, the questionable hypothetical genes were re-predicted by integrating the TN curve and Z curve methods. As a result, 30 genes originally annotated as "hypothetical" were identified as non-coding sequences. Testing the re-prediction program 10 times on data sets composed of function-known genes gave a mean accuracy of 99.99% and a mean Matthews correlation coefficient of 0.9999. Further sequence analysis and COG analysis showed that the re-annotation results were very reliable. This work can provide an efficient tool and data resources for future studies of A. tumefaciens strain C58.

Keywords: Agrobacterium tumefaciens strain C58; protein-coding gene; genome re-annotation; graphical representation

Classification number: Q933 [Biology: Microbiology]

 
