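Authors: Ren Yipeng (任毅鹏)[1], Zhang Jiaqing (张佳庆)[1], Sun Yu (孙瑜)[1], Wu Zhenfeng (吴振峰)[2], Ruan Jishou (阮吉寿)[2], He Bingjun (贺秉军)[1], Liu Guoqing (刘国卿)[1], Gao Shan (高山)[1], Bu Wenjun (卜文俊)[1]
Affiliations: [1] College of Life Sciences, Nankai University, Tianjin 300071; [2] School of Mathematical Sciences, Nankai University, Tianjin 300071
Source: Chinese Science Bulletin (《科学通报》), 2016, Issue 11, pp. 1250-1254 (5 pages)
Funding: Supported by the Nankai University 2015 Graduate Research Innovation Program and the National Natural Science Foundation of China (31371974; 31201738)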
Abstract: At present, the vast majority of transcriptome data are obtained with second-generation high-throughput sequencing technology, typified by the Illumina platform. However, second-generation sequencing cannot provide large numbers of long transcripts and loses important information such as alternative splicing, which greatly restricts in-depth use of transcriptome data. Third-generation sequencing technology, typified by PacBio, can yield longer and even full-length transcriptomes, but because PacBio transcriptome sequencing has emerged only in recent years, transcriptomes have been obtained on the PacBio platform for only a small number of species. PacBio full-length transcriptome sequencing has only just begun internationally but is developing rapidly, and research on its experimental and data-analysis standards and on quality control is crucial for future large-scale application. This study is the first attempt internationally to optimize experimental parameters based on the latest PacBio reagents (P6/C4), to design quality-control metrics and to standardize full-length transcriptome sequencing. Based on a set of full-length transcriptome data from an insect (麻皮椿), this paper reports some of the results obtained so far.
Next-generation sequencing (NGS) technology, particularly the Illumina platform, has produced most animal and plant transcriptomes to date, but the short reads from NGS sequencers result in incompletely assembled transcripts that lack important information (e.g. alternative splicing). This limits the understanding of transcriptome data. Based on single-molecule real-time (SMRT) sequencing technology, the PacBio platform can provide longer and even full-length transcripts that originate from observations of single molecules without assembly. Full-length transcripts can be used to investigate alternative splicing, alternative polyadenylation, novel genes, non-coding RNAs, fusion transcripts, etc. By the end of 2015, the transcriptomes of a few species had been sequenced using the PacBio platform. They are classified into three groups. The first group includes human lymphoblastoid and Salvia miltiorrhiza, using a combination of NGS short reads and SMRT technology. The second group includes HIV-1, bovine immunoglobulin G, human embryonic stem cells, mouse neurexins and Propithecus coquereli, using SMRT. The third group includes European cuttlefish, tetraploid cotton and fungi, using SMRT with the latest PacBio full-length transcriptome data analysis pipeline, IsoSeq. The use of the SMARTer PCR cDNA Synthesis Kit and the IsoSeq data analysis pipeline was recommended to facilitate full-length transcriptome sequencing.
However, transcriptome data quality can be affected by ribosomal RNA contamination, cross-contamination on agarose gel, the effect of size selection using gel or BluePippin, the prevalence of PCR chimera products and the incorrect removal of SMRTbell adapters. Although IsoSeq can remove artificial concatemers that are produced due to an insufficient SMRTbell amount during the sequencing library preparation step, some problems still exist. For example, IsoSeq cannot distinguish PCR chimeras from true fusion genes. Another critical problem is the misidentification