DNA palette code for time-series archival data storage  


Authors: Zihui Yan (闫子慧), Haoran Zhang (张皓然), Boyuan Lu (卢博源), Tong Han (韩彤), Xiaoguang Tong (佟小光), Yingjin Yuan (元英进)

Affiliations: [1] Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China; [2] Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China; [3] Department of Neurosurgery, Huanhu Hospital, Tianjin 300350, China

Source: National Science Review (国家科学评论, English edition), 2025, Issue 1, pp. 288-296 (9 pages)

Funding: Supported by the National Key R&D Program of China (2023YFA0913800) and the National Natural Science Foundation of China (21621004).

Abstract: The long-term preservation of large volumes of infrequently accessed cold data poses challenges to the storage community. Deoxyribonucleic acid (DNA) is considered a promising solution due to its inherent physical stability and high storage density. Information density and decoding sequence coverage are two important metrics that influence the efficiency of DNA data storage. In this study, we propose a novel coding scheme called the DNA palette code, which is suitable for cold data, especially time-series archival datasets. These datasets are not frequently accessed, but require reliable long-term storage for retrospective research. The DNA palette code employs unordered combinations of index-free oligonucleotides to represent binary information. It can achieve high net information density encoding and lossless decoding with low sequencing coverage. When sequencing reads are corrupted, it can still effectively recover partial information, preventing the complete failure of file retrieval. In vitro testing of clinical brain magnetic resonance imaging (MRI) data storage, together with simulation validations using large-scale public MRI datasets (10 GB), planetary science datasets and meteorological datasets, demonstrates the advantages of our coding scheme, including high net information density, low decoding sequence coverage and wide applicability.
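Illustrative note: the abstract states that unordered combinations of index-free oligonucleotides represent binary information, but it does not give the construction. The sketch below is therefore not the DNA palette code itself. It only shows, under assumed parameters, one generic way an unordered subset can carry an integer: the combinatorial number system. The pool size, subset size, and all function names are hypothetical, and each index merely stands in for one oligonucleotide in an assumed pool.

```python
from math import comb


def capacity_bits(pool_size: int, k: int) -> int:
    """Whole bits representable by choosing k items out of pool_size (order ignored)."""
    return comb(pool_size, k).bit_length() - 1


def int_to_combination(value: int, pool_size: int, k: int) -> list[int]:
    """Map an integer to a k-subset of range(pool_size) (combinatorial number system)."""
    if not 0 <= value < comb(pool_size, k):
        raise ValueError("value does not fit in the chosen pool_size and k")
    chosen = []
    upper = pool_size
    for i in range(k, 0, -1):
        # Largest index c below `upper` with comb(c, i) <= value.
        c = i - 1
        while c + 1 < upper and comb(c + 1, i) <= value:
            c += 1
        chosen.append(c)
        value -= comb(c, i)
        upper = c
    return sorted(chosen)


def combination_to_int(indices) -> int:
    """Recover the integer; the order in which indices are observed does not matter."""
    return sum(comb(c, i) for i, c in enumerate(sorted(indices), start=1))


if __name__ == "__main__":
    # Hypothetical toy parameters: a pool of 5 items, 2 chosen per codeword.
    subset = int_to_combination(9, pool_size=5, k=2)
    print(subset, combination_to_int(reversed(subset)), capacity_bits(5, 2))
```

Because the decoder sorts the observed indices before ranking them, the subset can be read back in any order, which is the property that lets a combination-based codeword do without a per-item address in this toy setting.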

Keywords: DNA data storage; synthetic biology; medical imaging; error-correcting codes

Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
