Transcriptome Sequencing and Bioinformatic Analysis of Pseudolarix amabilis, an Endangered Gymnosperm in China

Cited by: 10


Authors: ZHANG Wen-xiu; ZHANG Li; KOU Yi-xuan; ZHANG Zhi-yong [1] (Laboratory of Subtropical Biodiversity, College of Agronomy, Jiangxi Agricultural University, Nanchang 330045, China)

Affiliation: [1] College of Agronomy / Laboratory of Subtropical Biodiversity, Jiangxi Agricultural University

Source: Acta Agriculturae Universitatis Jiangxiensis, 2019, Issue 4, pp. 761-772 (12 pages)

Funding: National Natural Science Foundation of China (41461008)

Abstract: Pseudolarix amabilis (Nelson) Rehd. (Pinaceae) is a relict gymnosperm endemic to China and the sole species of its genus. Few natural populations survive; most existing stands are introduced and cultivated, and the species is also a well-known ornamental garden tree. Its genetic background and genomic information remain unclear, and genomic resources are urgently needed for its conservation and for studies of its genetic structure.

In this study, the leaf transcriptome of Pseudolarix amabilis was sequenced on the Illumina HiSeq 2500 high-throughput platform. De novo assembly yielded 70,761 unigenes with an average length of 699 bp and an N50 of 1,300 bp; Q20 and Q30 sequences accounted for 96.59% and 91.29% of the data, respectively. The unigenes were functionally annotated by searching against seven protein and functional-domain databases, and 43,674 unigenes (61.72%) were successfully annotated. In the GO database, 28,355 unigenes were assigned to 56 terms within the three main GO categories, with biological process the largest category. KEGG pathway analysis annotated 14,623 unigenes and identified 32 significantly enriched pathways, with metabolism-related genes the most numerous. In the KOG database, 15,931 unigenes were assigned to 26 functional categories, among which genes involved in general function, transcription, translation, modification, and protein transport were the most abundant. In addition, an EST-SSR search of the transcriptome with the MISA software detected 2,462 EST-SSR loci in 2,260 unigenes, a distribution frequency of 3.48%. Of these sequences, 180 contained more than one EST-SSR locus (the journal's English abstract gives 121) and 83 contained compound EST-SSRs. Trinucleotide repeat motifs were the most abundant type, accounting for 42.53% (1,047 EST-SSRs), and most motifs were repeated 5 to 8 times. These transcriptome sequences provide valuable information for understanding the molecular mechanisms of biological processes in Pseudolarix amabilis, and offer abundant resources for future functional genomic analysis, molecular marker development, and population genetics analysis.
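The assembly and EST-SSR statistics reported in the abstract can be illustrated with a short Python sketch. It is not the authors' pipeline: the study used MISA, whose settings the abstract does not give, so the minimum repeat counts in `MIN_REPEATS` are assumed values for illustration, and `n50` and `find_ssrs` are hypothetical helper names. The sketch computes an N50 from a list of sequence lengths, finds perfect (non-compound) SSRs with regular expressions, and recomputes the reported percentages from the reported counts.

```python
import re


def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together hold at least half of all assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Assumed thresholds (motif length -> minimum number of repeats).
# These are illustrative; the abstract does not state the MISA settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in min_repeats.items():
        # One motif of `unit` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AAA", "ATAT").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


# Percentages recomputed from the counts reported in the abstract.
ssr_frequency = round(100 * 2462 / 70761, 2)    # EST-SSR loci per unigene
trinucleotide_share = round(100 * 1047 / 2462, 2)
annotated_share = round(100 * 43674 / 70761, 2)
```

For example, `find_ssrs("CC" + "ATG" * 6 + "CC")` reports a single trinucleotide locus starting at position 2 with six repeats, and the recomputed percentages match the reported 3.48%, 42.53%, and 61.72%. Compound SSRs, which MISA also reports, are left out of this sketch.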

Keywords: Pseudolarix amabilis; transcriptome; functional annotation; bioinformatics; EST-SSR

Classification number: S718.43 [Agricultural Sciences: Forestry]

 
