Construction of a prediction model for depression risk in perimenopausal women


Authors: Wang Dengqin (王登芹) [1]; Song Peibo (宋佩博); Li Wanbin (李万斌); Xie Jingrui (谢京蕊)

Affiliations: [1] Medical Comprehensive Training Center, Jining Medical University, Jining 272067, China; [2] School of Mathematics, Shandong University, Jinan 250100, China; [3] Department of Clinical Medicine, Jining Medical University, Jining 272067, China; [4] Department of Gynecology, China Academy of Chinese Medical Sciences Xiyuan Hospital Jining Hospital, Jining 272000, China

Source: Chinese Journal of Behavioral Medicine and Brain Science (中华行为医学与脑科学杂志), 2025, Issue 2, pp. 151-157 (7 pages)

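Funding: Shandong Province Social Science Popularization and Application Research Project (2021-SKZZ-119); Shandong Province Traditional Chinese Medicine Science and Technology Development Project (2019-0461).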

Abstract: Objective: To establish a machine learning-based risk prediction model for perimenopausal depressive symptoms and to identify associated risk factors. Methods: A total of 1105 women aged 45 to 55 years were selected from the 2020 China Health and Retirement Longitudinal Study (CHARLS) dataset. Three machine learning algorithms, Random Forest, XGBoost, and Adaptive Boosting (AdaBoost), were used to construct prediction models for perimenopausal depressive symptoms. Descriptive statistics and between-group comparisons were performed in SPSS 24.0, and the risk prediction models were built in Python 3.10. Model performance was assessed with receiver operating characteristic (ROC) curves and calibration curves, and the best-performing model was selected accordingly. The Shapley additive explanations (SHAP) algorithm was then used to analyze feature importance and the influence of each feature on the predicted outcome. Results: Among the 1105 perimenopausal women, 671 (60.7%) were in the non-depressive symptom group and 434 (39.3%) were in the depressive symptom group. Of the three machine learning models, the Random Forest model showed the best overall performance, with an area under the ROC curve (AUC) of 0.793 and a calibration error of 0.181. SHAP analysis showed that annual household income was the strongest risk factor in the Random Forest model, with a relative importance of 0.048, followed by cognitive function (0.047), self-rated health status (0.046), life satisfaction (0.043), and sleep duration (0.041). Conclusions: The Random Forest-based model can effectively predict the risk of perimenopausal depressive symptoms. Annual household income, cognitive function, self-rated health status, and life satisfaction are risk factors for depressive symptoms in perimenopausal women.
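The two evaluation quantities named in the abstract, AUC and a calibration measure, can be illustrated with a minimal sketch in plain Python. The labels and probabilities below are invented toy values, not CHARLS data, and the function names `roc_auc` and `brier_score` are hypothetical helpers. The abstract does not say which calibration measure produced the value 0.181; the Brier score is used here only as an assumption, since it is one common choice.

```python
# Illustrative only: toy labels and predicted probabilities, NOT study data.

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and 0/1 outcome.

    Assumption: used here as a stand-in calibration measure; the abstract
    does not name the measure behind its reported calibration value.
    """
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical example: 1 = depressive symptoms, 0 = no depressive symptoms.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

auc = roc_auc(toy_labels, toy_probs)
brier = brier_score(toy_labels, toy_probs)
print(f"AUC = {auc:.3f}, Brier score = {brier:.3f}")
```

Higher AUC means better discrimination between the two groups, while a lower Brier score means predicted probabilities sit closer to the observed outcomes. The pairwise AUC loop is quadratic in sample size, which is fine for a sketch but would normally be replaced by a rank-based library routine.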

Keywords: perimenopause; machine learning; prediction model; depression; random forest model

Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
