Assembling and Characteristic Analysis of the Complete Chloroplast Genome of Vitis vinifera cv. Cabernet Sauvignon from High-Throughput Sequencing Data

Cited by: 16


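Authors: Xie Haikun (谢海坤), Jiao Jian (焦健)[1], Fan Xiucai (樊秀彩)[1], Zhang Ying (张颖)[1], Jiang Jianfu (姜建福)[1], Sun Haisheng (孙海生)[1], Liu Chonghuai (刘崇怀)[1]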

Affiliation: [1] Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009

Source: Scientia Agricultura Sinica (《中国农业科学》), 2017, Issue 9, pp. 1655-1665 (11 pages)

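Funding: Earmarked Fund for the China Agriculture Research System (CARS-30-yz-1); Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2015-ZFRI); Species Conservation Project of the Ministry of Agriculture (2130135-34)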

Abstract: [Objective] Using the Eurasian grape (Vitis vinifera) cultivar 'Cabernet Sauvignon' as material, this study established a method for assembling the complete chloroplast (cp) genome of Vitis plants and analyzing its characteristics, to provide methodological guidance for future evolutionary and phylogenetic studies of Vitis. [Method] Total genomic DNA was extracted from young leaves of 'Cabernet Sauvignon' using a plant genomic DNA kit. Small-fragment (350 bp) DNA libraries were constructed according to the manufacturer's manual and sequenced on the Illumina HiSeq platform with a PE150 paired-end strategy at a sequencing depth of 10-fold. Grape cp reads were extracted with BLASTN, using the published cp genome sequences of Arabidopsis thaliana (NC000932) and V. vinifera 'Pinot Noir' (DQ424856) as references. SOAPdenovo 2.04 was then used to assemble the extracted reads into the complete cp genome of 'Cabernet Sauvignon', and its basic characteristics were analyzed with bioinformatics software. [Result] High-throughput Illumina sequencing yielded 5.2 G of raw whole-genome data, of which 0.42 G of clean data were grape cp reads, about 8% of the total. The extracted reads were successfully assembled into the complete cp genome. The genome is a circular molecule of 160 676 bp with the quadripartite structure typical of angiosperm cp genomes: a pair of inverted repeats (IRA and IRB) of 26 235 bp each, separated by a large single copy (LSC) region of 89 134 bp and a small single copy (SSC) region of 19 072 bp. A total of 154 genes were annotated, including 99 protein-coding genes, 47 tRNA genes and 8 rRNA genes. The GC content of the cp genome was 37.43%. The genome contained 37 tandem repeat sequences and 53 dispersed repeats; most tandem repeats were 11-42 bp long and together accounted for 0.83% of the cp genome, while the dispersed repeats accounted for 5.33%. In addition, 50 simple sequence repeat (SSR) loci were detected. Most SSRs were composed of A or T, and they were unevenly distributed across the genome: the LSC region contained 39 SSRs, whereas the SSC and IR regions contained only 7 and 4, respectively. Codons of the protein-coding genes preferentially used A/T bases; codons encoding leucine (L) were used most frequently, whereas those encoding …
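The structural figures reported in the abstract can be cross-checked with a few lines of Python. The sketch below is illustrative only and is not the authors' pipeline: the function names are hypothetical, and the SSR threshold (`min_repeats=10`) is an assumed value, since the abstract does not state the criteria used. It shows how the quadripartite length is the sum of the LSC, the SSC and two IR copies, how GC content is computed, and one simple way to locate mononucleotide SSRs.

```python
import re

# Region lengths (bp) as reported in the abstract.
LSC, SSC, IR = 89_134, 19_072, 26_235
TOTAL = 160_676


def quadripartite_length(lsc, ssc, ir):
    """Length of a circular cp genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def gc_content(seq):
    """Percentage of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_mono_ssrs(seq, min_repeats=10):
    """Return (start, base, length) for each mononucleotide run of at
    least min_repeats bases. The threshold is an assumption, not a value
    taken from the paper."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # The reported region lengths add up to the reported genome size.
    print(quadripartite_length(LSC, SSC, IR) == TOTAL)
    # Toy sequence, not real grape data.
    print(find_mono_ssrs("GG" + "A" * 12 + "CTTTT"))
```

Applied to the assembled genome sequence, `gc_content` would be expected to give a value close to the reported 37.43%, and the SSR scan could be run per region to compare against the reported LSC, SSC and IR counts, though the exact counts depend on the thresholds chosen.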

Keywords: 'Cabernet Sauvignon'; chloroplast genome; high-throughput sequencing; characteristic analysis; phylogenetic analysis

Classification No.: S663.1 [Agricultural Sciences: Pomology]

 
