Construction and Annotation of Ascosphaera apis Full-Length Transcriptome Utilizing Nanopore Third-Generation Long-Read Sequencing Technology  Cited by: 15


Authors: DU Yu, ZHU ZhiWei, WANG Jie, WANG XiuNa[3,4], JIANG HaiBin, FAN YuanChan, FAN XiaoXue, CHEN HuaZhi, LONG Qi, CAI ZongBing, XIONG CuiLing, ZHENG YanZhen, FU ZhongMin, CHEN DaFu, GUO Rui[1,2]

Affiliations: [1] College of Animal Sciences (College of Bee Science), Fujian Agriculture and Forestry University, Fuzhou 350002; [2] Apitherapy Research Institution, Fujian Agriculture and Forestry University, Fuzhou 350002; [3] College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002; [4] Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province (Fujian Agriculture and Forestry University), Fuzhou 350002

Source: Scientia Agricultura Sinica (《中国农业科学》), 2021, Issue 4, pp. 864-876 (13 pages)

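Funding: China Agriculture Research System (CARS-44-KXJ7); Natural Science Foundation of Fujian Province (2018J05042); Outstanding Young Scientific Research Talents Program of Fujian Agriculture and Forestry University (xjq201814); Fund for Excellent Master's Dissertations of Fujian Agriculture and Forestry University (DU Yu); Open Project of the Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province (GUO Rui); Open Fund of the Jiangxi Province Key Laboratory of Honeybee Biology and Beekeeping (JXKLHBB-2020-04).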

Abstract: 【Objective】Purified mycelia (Aam) and spores (Aas) of Ascosphaera apis were sequenced using third-generation nanopore long-read sequencing technology in order to construct and annotate a high-quality full-length transcriptome of A. apis.【Method】Aam and Aas were each sequenced on the Oxford Nanopore PromethION platform. Guppy software was used for base calling of raw reads, and clean reads were obtained by filtering out short fragments and low-quality raw reads. Full-length transcript sequences were identified by recognizing the primers at both ends. Full-length transcripts were aligned against the Nr, Swissprot, KOG, eggNOG, Pfam, GO and KEGG databases to obtain annotations. Four approaches (CPC, CNCI, CPAT and Pfam) were each used to predict long non-coding RNAs (lncRNAs), and the intersection of the four was taken as the set of high-confidence lncRNAs.【Result】Nanopore sequencing of Aam and Aas yielded 6 321 704 and 6 259 727 raw reads, respectively; after quality control, 5 669 436 and 6 233 159 clean reads were obtained, including 4 497 102 (79.32%) and 4 963 101 (79.62%) full-length clean reads. In total, 9 859 and 16 795 non-redundant full-length transcripts were identified, with N50 values of 1 482 and 1 658 bp, average lengths of 1 187 and 1 303 bp, and maximum lengths of 6 472 and 6 815 bp, respectively. Venn analysis showed that 6 512 non-redundant full-length transcripts were shared by Aam and Aas, while 3 347 and 10 283 were specific to Aam and Aas, respectively. In addition, a total of 20 142 full-length transcripts were identified in Aam and Aas, of which 20 809, 11 151, 17 723, 12 164, 11 340 and 9 833 could be annotated to the Nr, KOG, eggNOG, Pfam, GO and KEGG databases, respectively. The species with the most annotated full-length transcripts were A. apis, Polytolypa hystricis and Histoplasma capsulatum. GO annotation showed that these full-length transcripts could be annotated to 45 functional terms, including cellular component terms such as cell part, cell and organelle; molecular function terms such as catalytic activity, binding and transporter activity; and biological process terms such as cellular process, metabolic process and single-organism process. KEGG annotation showed that these full-length transcripts could also be annotated to biosynthesis of antibiotics, ribosome, amino acid…
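The summary statistics named in the abstract (N50, the Venn split between Aam and Aas, and the four-tool lncRNA intersection) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names `n50`, `venn_partition` and `consensus_lncrna` and the toy transcript IDs are hypothetical, and each tool's output is assumed to have already been reduced to a set of transcript IDs that the tool calls non-coding.

```python
from typing import Dict, Iterable, Set, Tuple


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that transcripts of length >= L
    together cover at least half of the total bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return ordered[-1]  # unreachable for positive lengths; kept for safety


def venn_partition(a: Set[str], b: Set[str]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split two transcript-ID sets into (shared, only in a, only in b)."""
    return a & b, a - b, b - a


def consensus_lncrna(calls: Dict[str, Set[str]]) -> Set[str]:
    """Keep only transcripts that every tool flags as non-coding.

    `calls` maps a tool name to its set of non-coding transcript IDs
    (hypothetical input format, assumed for this sketch).
    """
    if not calls:
        return set()
    return set.intersection(*calls.values())


# Toy data (hypothetical IDs), standing in for the Aam and Aas transcript sets.
aam = {"t1", "t2", "t3"}
aas = {"t2", "t3", "t4", "t5"}
shared, aam_only, aas_only = venn_partition(aam, aas)

# Toy per-tool calls; only "t2" is flagged by all four tools.
tool_calls = {
    "CPC": {"t1", "t2"},
    "CNCI": {"t2", "t3"},
    "CPAT": {"t2"},
    "Pfam": {"t2", "t4"},
}
high_confidence = consensus_lncrna(tool_calls)
```

Applied to the counts reported in the abstract, the Venn arithmetic is self-consistent: 9 859 - 6 512 = 3 347, 16 795 - 6 512 = 10 283, and 6 512 + 3 347 + 10 283 = 20 142.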

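Keywords: third-generation high-throughput sequencing technology; nanopore sequencing; full-length transcript; reference transcriptome; honeybee; Ascosphaera apis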

Classification No.: S895.137 [Agricultural Sciences: Special Economic Animal Husbandry]

 
