Transcriptome Sequencing and Bio-information Analysis of Colaphellus bowringi

Cited by: 5

Authors: SHA Jun-Xue; CHI Bao-Yan; ZHANG Jin-Bo[1]; LI Yang-Yang[1]; SHI Chen-Chen; NI He-Jia; LI Hai-Tao[1,2]; GAO Ji-Guo

Affiliations: [1] College of Life Science, Northeast Agricultural University, Harbin 150030, China; [2] Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China

Source: Journal of Agricultural Biotechnology (《农业生物技术学报》), 2018, No. 6, pp. 978-986 (9 pages)

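Funding: Natural Science Foundation of Heilongjiang Province (No. C2016025); Open Fund of the State Key Laboratory for Biology of Plant Diseases and Insect Pests (No. SKLOF201705)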

Abstract: Colaphellus bowringi is a common vegetable pest. Because its whole genome has not yet been published, information on its genes is unavailable, and research on gene function in this species has progressed slowly. In this study, next-generation high-throughput sequencing combined with bioinformatic analysis was used to obtain complete transcriptome data for C. bowringi, and the raw data were deposited at the National Center for Biotechnology Information (NCBI) under accession number SRP125298. A total of 40 489 unigenes were assembled de novo. BLASTX was used to search the unigenes against the NR (NCBI non-redundant protein sequences), NT (NCBI nucleotide sequences), KEGG (Kyoto Encyclopedia of Genes and Genomes), Swiss-Prot (manually annotated and reviewed protein sequence database), Pfam (Protein Family), GO (Gene Ontology) and KOG (euKaryotic Ortholog Groups) databases for gene function and pathway annotation. In total, 19 955 unigenes were annotated; 17 437, 4 158, 6 817, 12 129, 15 890, 8 641 and 11 416 unigenes were annotated in the NR, NT, KEGG, Swiss-Prot, Pfam, GO and KOG databases. Microsatellite (simple sequence repeat, SSR) analysis of all unigenes identified 2 992 SSRs in 2 692 unigenes, with an average length of 14.61 bp. Mononucleotide and trinucleotide repeats were the most abundant types, accounting for 65.51% and 21.76%, and were dominated by the A/T (94.95%) and TTA/TAA (8.45%) motifs, respectively. A total of 115 repeat motif types were found, including 12 dinucleotide, 60 trinucleotide and 38 tetranucleotide motifs. Motif repeat numbers ranged from 5 to 50, and SSR lengths were mainly 10 to 20 bp. By providing these transcriptome data for C. bowringi, this study offers basic data for species identification and the analysis of genetic diversity.
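The abstract does not say which SSR-detection tool or thresholds were used. As a rough illustration of what an SSR scan over unigenes involves, the Python sketch below finds perfect mono- to tetranucleotide repeats with regular expressions. The minimum repeat counts in `MIN_REPEATS` and the function names `find_ssrs` and `summarize` are assumptions chosen for illustration; they are not the study's parameters or pipeline.

```python
import re
from collections import Counter

# Minimum repeat counts per motif length in bases.
# Assumed, illustrative thresholds; the abstract does not state the ones used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count, length_bp) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base unit followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip units that are repeats of a shorter motif (e.g. "AA", "ATAT"),
            # so a poly-A run is reported once, as a mononucleotide SSR.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            length = m.end() - m.start()
            hits.append((m.start(), motif, length // k, length))
    return sorted(hits)


def summarize(hits):
    """Count SSRs per motif length and compute the mean SSR length in bp."""
    by_unit = Counter(len(motif) for _, motif, _, _ in hits)
    mean_len = sum(h[3] for h in hits) / len(hits) if hits else 0.0
    return by_unit, mean_len


if __name__ == "__main__":
    # Toy sequence, not real data: poly-A, (TTA)5 and (AT)6 tracts.
    toy = "GC" + "A" * 12 + "GCGT" + "TTA" * 5 + "CG" + "AT" * 6 + "C"
    ssrs = find_ssrs(toy)
    for start, motif, count, length in ssrs:
        print(start, motif, count, length)
    print(summarize(ssrs))
```

Applied to every unigene, per-type proportions and an average SSR length like those reported in the abstract would come from aggregating the `summarize` output. The scan is non-overlapping within each motif length, so an SSR that begins inside a skipped degenerate match can be missed; dedicated SSR tools handle such cases more carefully.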

Keywords: Colaphellus bowringi; transcriptome sequencing; SSR molecular markers

Classification No.: S186 [Agricultural Sciences: Basic Agricultural Sciences]

 
