Constructing a fatty liver prediction model based on machine learning

Authors: Song Yingying; Zhang Zhanzhao (Xinxiang Medical University, Xinxiang 453003, China; Sanquan College of Xinxiang Medical University, Xinxiang 453003, China)

Affiliations: [1] Xinxiang Medical University, Xinxiang 453003; [2] Sanquan College of Xinxiang Medical University, Xinxiang 453003

Source: Modern Instruments & Medical Treatment, 2025, No. 1, pp. 43-50 (8 pages)

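Funding: Henan Province 2022 provincial-level featured demonstration course integrating professional education with innovation and entrepreneurship education (195).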

Abstract: Objective: This study aims to use machine learning to construct an efficient fatty liver prediction model that helps medical personnel accurately identify and classify populations at high risk of fatty liver, thereby supporting early intervention and personalized treatment. Methods: Data from 2017 to 2020 were taken from the US National Health and Nutrition Examination Survey (NHANES) database. Eleven variables were ultimately included in the analysis. Six machine learning algorithms were used to construct the prediction models: Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), K-Nearest Neighbor (KNN), Logistic Regression, and Artificial Neural Network (ANN). A randomly selected 70% of the data served as the training set and the remaining 30% as the test set. Model evaluation metrics were accuracy, precision, recall, F1 score, the Receiver Operating Characteristic (ROC) curve, and the Area Under the Curve (AUC). Results: A total of 3383 participants were included, of whom 1506 had fatty liver. The AUCs of the DT, XGBoost, AdaBoost, KNN, Logistic Regression, and ANN models were 0.98838, 0.94624, 0.88935, 0.99110, 0.90047, and 0.98990, respectively. Except for the KNN model, the calibration curves of all models performed well, and the AdaBoost model showed the best predictive performance among the six models. Conclusion: The AdaBoost model is considered the optimal model for predicting fatty liver and can explain the interactions between feature variables. Body mass index, triglycerides, and low-density lipoprotein can effectively identify populations at high risk of fatty liver.
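The evaluation procedure named in the abstract (a random 70/30 split, then accuracy, precision, recall, F1, and AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `train_test_split`, `classification_metrics`, and `roc_auc` are hypothetical, the labels and scores are toy values rather than NHANES data, and the 0.5 decision threshold is an assumption. The sketch uses only the Python standard library.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fatty liver)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and predicted probabilities (illustrative only).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))

    # Split sizes for a cohort the size of the one in the abstract.
    train, test = train_test_split(range(3383), test_fraction=0.3, seed=42)
    print(len(train), len(test))
```

The pairwise form of AUC is quadratic in the sample size, which is acceptable for a sketch; a rank-based computation gives the same value more efficiently on large test sets.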

Keywords: fatty liver; machine learning; model prediction; adaptive boosting algorithm; receiver operating characteristic curve

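Classification: TH77 [Mechanical Engineering: Instrument Science and Technology]; R575.5 [Mechanical Engineering: Precision Instruments and Machinery]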

 
