High-quality Arabidopsis thaliana Genome Assembly with Nanopore and HiFi Long Reads  Cited by: 12


Authors: Bo Wang, Xiaofei Yang, Yanyan Jia, Yu Xu, Peng Jia, Ningxin Dang, Songbo Wang, Tun Xu, Xixi Zhao, Shenghan Gao, Quanbin Dong, Kai Ye

Affiliations: [1] MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China [2] School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China [3] School of Life Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China [4] School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China [5] Genome Institute, the First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China

Source: Genomics, Proteomics & Bioinformatics (English edition), 2022, Issue 1, pp. 4-13 (10 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62172325 and 32070663); the China Postdoctoral Science Foundation (Grant No. 2020M673420); the Fundamental Research Funds for the Central Universities, China; the World-Class Universities (Disciplines); the Characteristic Development Guidance Funds for the Central Universities, China.

Abstract: Arabidopsis thaliana is an important and long-established model species for plant molecular biology, genetics, epigenetics, and genomics. However, the latest version of the reference genome still contains a significant number of missing segments. Here, we reported a high-quality and almost complete Col-0 genome assembly with two gaps (named Col-XJTU) by combining Oxford Nanopore Technologies ultra-long reads, Pacific Biosciences high-fidelity long reads, and Hi-C data. The total genome assembly size is 133,725,193 bp, introducing 14.6 Mb of novel sequences compared to the TAIR10.1 reference genome. All five chromosomes of the Col-XJTU assembly are highly accurate, with consensus quality (QV) scores > 60 (ranging from 62 to 68), which are higher than those of the TAIR10.1 reference (ranging from 45 to 52). We completely resolved chromosome (Chr) 3 and Chr5 in a telomere-to-telomere manner. Chr4 was completely resolved except for the nucleolar organizing regions, which comprise long repetitive DNA fragments. The Chr1 centromere (CEN1), reportedly around 9 Mb in length, is particularly challenging to assemble due to the presence of tens of thousands of CEN180 satellite repeats. Using cutting-edge sequencing data and novel computational approaches, we assembled a 3.8-Mb-long CEN1 and a 3.5-Mb-long CEN2. We also investigated the structure and epigenetics of centromeres. Four clusters of CEN180 monomers were detected, and the centromere-specific histone H3-like protein (CENH3) exhibited a strong preference for CEN180 Cluster 3. Moreover, we observed hypomethylation patterns in CENH3-enriched regions. We believe that this high-quality genome assembly, Col-XJTU, would serve as a valuable reference to better understand the global pattern of centromeric polymorphisms, as well as the genetic and epigenetic features in plants.
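Note on the QV figures: the abstract does not spell out how QV is defined. Assuming the usual Phred-style convention, QV = -10 * log10(p) with p the estimated per-base error probability, the reported ranges can be read as error rates. The sketch below is illustrative only; the function names are hypothetical and not taken from the paper.

```python
def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled consensus quality value to a per-base
    error probability, assuming QV = -10 * log10(p)."""
    return 10 ** (-qv / 10)


def errors_per_mb(qv: float) -> float:
    """Expected number of erroneous bases per megabase at a given QV."""
    return qv_to_error_rate(qv) * 1_000_000


# QV ranges quoted in the abstract (per-chromosome minimum and maximum).
reported = {
    "Col-XJTU": (62, 68),
    "TAIR10.1": (45, 52),
}

for name, (low, high) in reported.items():
    print(
        f"{name}: about {errors_per_mb(high):.2f} to "
        f"{errors_per_mb(low):.2f} expected errors per Mb"
    )
```

Under this assumption, QV 60 corresponds to one expected error per million bases, so every 10-point gain in QV means a tenfold lower error rate.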

Keywords: Centromere architecture; CENH3; Bacterial artificial chromosome; Telomere-to-telomere; Model plant

Classification number: Q943.2 [Biology: Botany]

 
