KoNA: Korean Nucleotide Archive as a New Data Repository for Nucleotide Sequence Data


Authors: Gunhwan Ko, Jae Ho Lee, Young Mi Sim, Wangho Song, Byung-Ha Yoon, Iksu Byeon, Bang Hyuck Lee, Sang-Ok Kim, Jinhyuk Choi, Insoo Jang, Hyerin Kim, Jin Ok Yang, Kiwon Jang, Sora Kim, Jong-Hwan Kim, Jongbum Jeon, Jaeeun Jung, Seungwoo Hwang, Ji-Hwan Park, Pan-Gyu Kim, Seon-Young Kim, Byungwook Lee

Affiliation: [1] Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea

Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报(英文版)), 2024, Issue 1, pp. 161-167 (7 pages)

Funding: Supported by the Next-generation Genome-InfraNET for the advancement of genome research and service (Grant No. 2019M3C9A5069653) and the Construction of biological data station (Grant No. 2020M3A9I6A01036057) grants from the National Research Foundation of Korea.

Abstract: During the last decade, the generation and accumulation of petabase-scale high-throughput sequencing data have resulted in great challenges, including access to human data, as well as transfer, storage, and sharing of enormous amounts of data. To promote data-driven biological research, the Korean government announced that all biological data generated from government-funded research projects should be deposited at the Korea BioData Station (K-BDS), which consists of multiple databases for individual data types. Here, we introduce the Korean Nucleotide Archive (KoNA), a repository of nucleotide sequence data. As of July 2022, the Korean Read Archive in KoNA has collected over 477 TB of raw next-generation sequencing data from national genome projects. To ensure data quality and prepare for international alignment, a standard operating procedure was adopted, which is similar to that of the International Nucleotide Sequence Database Collaboration. The standard operating procedure includes quality control processes for submitted data and metadata using an automated pipeline, followed by manual examination. To ensure fast and stable data transfer, a high-speed transmission system called GBox is used in KoNA. Furthermore, the data uploaded to or downloaded from KoNA through GBox can be readily processed using a cloud computing service called Bio-Express. This seamless coupling of KoNA, GBox, and Bio-Express enhances the data experience, including submission, access, and analysis of raw nucleotide sequences. KoNA not only satisfies the unmet needs for a national sequence repository in Korea but also provides datasets to researchers globally and contributes to advances in genomics. The KoNA is available at https://www.kobic.re.kr/kona/.

Keywords: Korea BioData Station; Nucleotide sequence; Next-generation sequencing repository; Genomics; Deposition and access of big data

Classification No.: Q811.4 [Biology: Bioengineering]

 
