Improvement of the sequences and functional annotations of the Apis cerana reference genome with nanopore long-read data from the gut transcriptome of larval A. cerana cerana workers


Authors: LI Kun-Ze; SONG Yu-Xuan; ZANG He; JING Xin; FAN Xiao-Xue; CHEN Ying[1]; NA Zhi-Hao; CHEN Da-Fu[1,2,3]; FU Zhong-Min; GUO Rui[1,2,3]

Affiliations: [1] College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; [2] National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China; [3] Apicultural Research Institute of Fujian Province, Fuzhou 350002, China

Source: Acta Entomologica Sinica (《昆虫学报》), 2024, Issue 3, pp. 346-357 (12 pages)

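Funding: National Natural Science Foundation of China (32172792, 32372943); Earmarked Fund for the China Agriculture Research System (CARS-44-KXJ7); Natural Science Foundation of Fujian Province, General Program (2022J01131334); Master Supervisor Team Fund of Fujian Agriculture and Forestry University (GUO Rui); Scientific and Technical Innovation Fund of Fujian Agriculture and Forestry University (KFb22060XA); Fujian Provincial Undergraduate Innovation and Entrepreneurship Training Program (202310389027, S202310389076).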

Abstract: 【Aim】To align previously obtained nanopore long-read transcriptome data of Apis cerana cerana to the A. cerana reference genome in order to optimize the structures of annotated genes, identify and functionally annotate unannotated novel genes and novel transcripts, and predict and verify their SSR loci, complete ORFs, and transcription factor (TF) families and members, thereby improving the sequence and functional annotations of the existing A. cerana reference genome.
【Methods】Based on previously obtained high-quality nanopore sequencing data of the gut transcriptome of 4-, 5- and 6-day-old larvae of A. cerana cerana workers inoculated with Ascosphaera apis, the identified full-length transcripts were mapped to the A. cerana reference genome with gffcompare to optimize the structures of the annotated genes. Novel genes and novel transcripts not annotated in the reference genome were identified with gffcompare and then functionally annotated by alignment against the Nr, KOG, eggNOG, GO and KEGG databases. MISA, TransDecoder v3.0.0 and animalTFDB 2.0 were used to predict SSR loci, complete ORFs, and TF families and members, respectively.
【Results】The structures of 4648 annotated genes in the A. cerana reference genome were optimized: both the 5′UTR and the 3′UTR were extended for 1336 genes, while the 5′UTR of 1688 genes and the 3′UTR of 1624 genes were extended separately. A total of 2148 novel genes were identified, of which 818, 298, 587, 359 and 333 could be annotated to the Nr, KOG, eggNOG, GO and KEGG databases, respectively. A total of 35432 novel transcripts were identified, of which 30974, 21222, 29025, 19852 and 9214 could be annotated to the same five databases, respectively. A total of 22541 SSR loci were detected, including 12078 mononucleotide, 7140 dinucleotide, 2825 trinucleotide and 43 hexanucleotide repeats; the number of mixed SSRs was 2964, and the type with the highest distribution frequency was the mononucleotide repeat (153.37/Mb). A total of 58 TF families with 1611 members were predicted. A total of 28775 complete ORFs were predicted, among which ORFs encoding 100-200 amino acids were the most abundant (38.99%).
【Conclusion】The results optimize the structures of annotated genes in the A. cerana reference genome and supplement it with previously unannotated novel genes, novel transcripts, SSRs, complete ORFs and TFs.
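The SSR figures in the abstract (counts per repeat type and a frequency per Mb) can be illustrated with a small Python sketch. This is not MISA and not the authors' pipeline: it only detects perfect repeats with a regular expression, ignores mixed (compound) SSRs, and the minimum repeat counts in `MIN_REPEATS` as well as the function names `find_ssrs` and `ssr_density_per_mb` are assumptions chosen for illustration, since the abstract does not state the thresholds used.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only;
# the abstract does not report the thresholds that were actually used).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return perfect SSRs as (start, end, motif, repeat_count).

    Coordinates are 0-based and half-open. Mixed (compound) SSRs are not handled.
    """
    seq = seq.upper()
    hits = []
    for unit_len in sorted(min_repeats):
        # A motif of unit_len bases followed by at least (min - 1) more copies.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats[unit_len] - 1)
        )
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            repeat_count = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), motif, repeat_count))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs, total_bases):
    """SSR frequency expressed as loci per megabase of scanned sequence."""
    return n_ssrs / (total_bases / 1_000_000)


if __name__ == "__main__":
    # Toy sequence with one mono-, one di- and one trinucleotide repeat.
    demo = "GC" + "A" * 12 + "CGC" + "AT" * 7 + "GGC" + "CAG" * 5 + "TT"
    for start, end, motif, count in find_ssrs(demo):
        print(f"{start}-{end}\t({motif}){count}")
```

A frequency such as the 153.37/Mb reported for mononucleotide repeats corresponds to the kind of ratio `ssr_density_per_mb` computes: the number of loci of one type divided by the total length of the scanned transcript sequences in megabases.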

Keywords: Apis cerana; Apis cerana cerana; third-generation sequencing; nanopore sequencing; full-length transcript; transcriptome; genome

Classification number: S891 [Agricultural Sciences: Special Economic Animal Husbandry]

 
