Authors: Xie Ning; Bi Wenjian; Zhang Zhongwen; Shao Fang; Wei Yongyue; Zhao Yang; Zhang Ruyang [5]; Chen Feng
Affiliations: [1] Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China [2] Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing 100191, China [3] Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing 100191, China [4] China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing 211166, China [5] Information Center, The Affiliated Changzhou Second People's Hospital of Nanjing Medical University, Changzhou 213164, China
Source: Chinese Journal of Epidemiology (《中华流行病学杂志》), 2025, Issue 1, pp. 147-153 (7 pages)
Funding: National Natural Science Foundation of China (82220108002, 82273737).
Abstract: Extremely unbalanced data are data in which the values of an independent or dependent variable are severely skewed in proportion. In this setting, the classical test statistics used for hypothesis testing in parametric models deviate markedly from their large-sample theoretical distributions, which inflates the type I error rate. The growing sharing of genome-wide resources from very large population cohorts has made the need for efficient and accurate statistical handling of extremely unbalanced data increasingly prominent, and has also driven the development of statistical genetics methods. This paper introduces two correction methods commonly used for extremely unbalanced data in current genome-wide association studies, the Firth correction and the saddlepoint approximation, and shows through simulation experiments that both effectively control the type I error. Finally, it briefly introduces software commonly used to analyze extremely unbalanced genomic data. The paper offers researchers a theoretical reference and practical recommendations for the statistical analysis of extremely unbalanced data.
Keywords: Genome-wide association study; Extremely unbalanced data; Firth correction; Saddlepoint approximation; Rare variants
Classification No.: R195.1 [Medicine and Health: Health Statistics]