Authors: 许喆 (XU Zhe) [1]; 程丝 (CHENG Si); 刘阳 (LIU Yang) [1]; 石延枫 (SHI Yanfeng); 李昊 (LI Hao) [1]
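Affiliation: [1] Department of Neurology, Beijing Tiantan Hospital, Capital Medical University; Center of Excellence for Omics Research, China National Clinical Research Center for Neurological Diseases, Beijing 100070, China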
Source: Chinese Journal of Stroke (《中国卒中杂志》), 2022, No. 3, pp. 216-226 (11 pages)
Funding: Capital's Funds for Health Improvement and Research, 2020 (首发2020-1-2041).
Abstract: Objective: To build and optimize a bioinformatics pipeline suited to genomic data analysis in cerebrovascular disease (CVD), and to promote multi-omics and precision medicine research in CVD. Methods: Clinical research needs were surveyed, and commonly used analysis methods were summarized from genomic and genetic studies in CVD and population genetics. The pipeline was designed in modules according to research objective and analysis content. Using genomic data from the China National Stroke Registry-Ⅲ (CNSR-Ⅲ), the pipeline was built, tested, and optimized on a high-performance computing cluster with a floating-point capacity of 375 trillion operations per second. Results: The pipeline comprises several modules, including data quality control, association analysis, linkage analysis, genetic variant annotation, and cross-omics analysis. These modules were applied to quality control and analysis of genomic data from more than ten thousand CNSR-Ⅲ samples. In total, 10,241 whole-genome sequenced samples passed quality control and had no kinship within the third degree; these samples will be used for genome-wide association analysis. Conclusions: Optimizing the bioinformatics pipeline around the characteristics of CVD can safeguard data quality for multi-omics research, improve research efficiency, and provide a basis for CVD risk assessment, diagnosis, and individualized treatment.
Keywords: cerebrovascular disease; genomics; genetics; bioinformatics analysis; big data
Classification number: R743 [Medicine and Health: Neurology and Psychiatry]