Authors: Dai Pinyuan (戴品远); Yu Xiaojin (余小金)[1]; Xie Weihua (谢纬华); Zhao Chao (赵超); Liu Ran (刘冉); Yin Lihong (尹立红); Chen Bingwei (陈炳为)[1] (Department of Epidemiology and Health Statistics, School of Public Health, Southeast University, Nanjing 210009)
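Affiliations: [1] Department of Epidemiology and Health Statistics, School of Public Health, Southeast University, 210009 [2] Quality Management Office, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University [3] Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University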
Source: Chinese Journal of Health Statistics (《中国卫生统计》), 2021, No. 5, pp. 656-660 (5 pages)
Funding: National Natural Science Foundation of China (81673274, 81872588).
Abstract: Objective: To explore the application of the conditional Gaussian Bayesian network (CGBN) to the classification of metabolomics data. Methods: Through a simulation study and an analysis of real metabolomics data, the classification performance of CGBN was compared with that of partial least squares discriminant analysis (PLSDA) on high-dimensional data with different degrees of correlation and different sparsity levels, and under both linear and non-linear relationships. The evaluation metrics were the area under the ROC curve (AUC) and the average computation time. Results: In the simulation study, CGBN achieved a higher AUC than PLSDA when the correlation between variables was low and the sample size was no more than 200. CGBN also achieved a higher AUC than PLSDA when the relationship between the independent variables and the dependent variable was non-linear and the sample was small. On the real data, the AUCs of CGBN and PLSDA were 0.997 and 0.975, respectively. The computation time of CGBN was far higher than that of PLSDA. Conclusion: Where computational burden is not a limiting factor, CGBN is a feasible alternative to the typical methods for analyzing metabolomics data and merits further study.
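The AUC used as the evaluation metric above can be illustrated with a minimal sketch. It computes the AUC as the proportion of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of classifier scores, higher means "more positive".
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two sets of scores for the same six samples.
labels = [1, 1, 0, 0, 1, 0]
scores_a = [0.9, 0.8, 0.3, 0.4, 0.7, 0.2]   # separates the classes perfectly
scores_b = [0.9, 0.35, 0.3, 0.4, 0.7, 0.2]  # one positive ranked below a negative

print(roc_auc(labels, scores_a))
print(roc_auc(labels, scores_b))
```

The pairwise loop is quadratic in the sample size, which is acceptable for a small illustration; a rank-based formulation would be preferable for large data sets.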
Classification number: R195.1 [Medicine and Health: Health Statistics]