Gene2DGE: A Perl Package for Gene Model Renewal with Digital Gene Expression Data  

Authors: Xiaoli Tang, Libin Deng, Dake Zhang, Jiari Lin, Yi Wei, Qinqin Zhou, Xiang Li, Guilin Li, Shangdong Liang

Affiliations: [1] Faculty of Basic Medical Science, Nanchang University, Nanchang 330006, China; [2] Institute of Translational Medicine, Nanchang University, Nanchang 330006, China; [3] Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China

Source: Genomics, Proteomics & Bioinformatics (English edition), 2012, Issue 1, pp. 51-54 (4 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 81171184, 31060139 and 30871384); the Natural Science Foundation of Jiangxi Province (Grant No. 20114BAB215019); the Department of Health of Jiangxi Province (Grant No. 20111209); and the Technology Pedestal and Society Development Project of Jiangxi Province (Grant Nos. 2010BSA09500 and 20111BBG70009-1)

Abstract: For transcriptome analysis, it is critical to precisely define all transcripts across the whole genome. A growing number of digital gene expression (DGE) studies have indicated the presence of a large number of novel transcripts in addition to the known gene models. However, almost all of these studies still depend crucially on existing annotation. Here, we present Gene2DGE, a Perl software package for gene model renewal with DGE data. We applied Gene2DGE to the mouse blastomere transcriptome and defined 98,532 read-enriched regions (RERs) by read clustering, requiring support from more than four reads at each base pair. Using this ab initio method, we refined 2,104 exonic regions (4% of a total of 48,501 annotated transcribed regions) that extend markedly into un-annotated regions (>50 bp). From the 5% of uniquely mapped reads that fall within intron regions, we identified 13,291 additional possible exons. As a result, we renewed 4,788 gene models, which account for 39% of a total of 12,277 transcribed genes. Furthermore, we identified 12,613 intergenic RERs, suggesting the possible presence of novel genes outside the existing gene models. This study therefore provides a tool for renewing known gene models by ab initio prediction in transcriptome dissection. The Gene2DGE package is freely available at http://bighapmap.big.ac.cn/.
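
The read-clustering criterion in the abstract (each base pair of an RER supported by more than four reads) can be illustrated with a short sketch. This is not the Gene2DGE implementation, which is written in Perl and is not reproduced here. The function name `find_rers`, the `min_depth` parameter, and the half-open `(chrom, start, end)` read format are all assumptions made for illustration only.

```python
from collections import defaultdict


def find_rers(reads, min_depth=5):
    """Return (chrom, start, end) half-open regions in which every base
    is covered by at least `min_depth` reads.

    `reads` is an iterable of (chrom, start, end) half-open alignments.
    The default of 5 encodes "more than four reads" (an assumption about
    how the published threshold would be applied).
    """
    # Sweep-line: record +1 where a read starts and -1 where it ends.
    deltas = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in reads:
        deltas[chrom][start] += 1
        deltas[chrom][end] -= 1

    rers = []
    for chrom in sorted(deltas):
        depth = 0
        region_start = None
        for pos in sorted(deltas[chrom]):
            # After this update, `depth` is the coverage from `pos`
            # up to the next boundary position.
            depth += deltas[chrom][pos]
            if depth >= min_depth and region_start is None:
                region_start = pos  # coverage rose to the threshold
            elif depth < min_depth and region_start is not None:
                rers.append((chrom, region_start, pos))  # coverage fell below it
                region_start = None
    return rers


# Hypothetical toy input: five identical reads on chr1, one overlapping
# read, and four identical reads on chr2.
reads = [("chr1", 100, 150)] * 5 + [("chr1", 140, 200)] + [("chr2", 10, 50)] * 4
print(find_rers(reads))
```

In this toy input only the chr1 interval reaches a depth of five, so the four-read region on chr2 is not reported. Comparing such regions against annotated exon, intron, and intergenic coordinates would be a separate step that this sketch does not cover.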

Keywords: transcriptome; annotation; ab initio prediction

Classification: Q-332 [Biology]; TP391.72 [Automation and Computer Technology: Computer Application Technology]

 
