Application of the random forest algorithm in postpartum depression risk prediction  Cited by: 21

Risk prediction for postpartum depression based on random forest


Authors: XIAO Meili; YAN Chunli[2]; FU Bing[3]; YANG Shuping[4]; ZHU Shujuan[3]; YANG Dongqi; LEI Beimei; HUANG Ruirui; LEI Jun[1,3] (Xiangya Nursing School, Central South University, Changsha 410013; Department of Oncology, Third Xiangya Hospital, Central South University, Changsha 410013; Department of Obstetrics and Gynaecology, Third Xiangya Hospital, Central South University, Changsha 410013; School of Mathematics and Statistics, Central South University, Changsha 410083; Department of Gynaecology, Henan Provincial People's Hospital, Zhengzhou 450000; Department of Otolaryngology, Xiangya Hospital, Central South University, Changsha 410008; School of Nursing, Hunan University of Medicine, Huaihua, Hunan 418000, China)

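Affiliations: [1] Xiangya Nursing School, Central South University, Changsha 410013; [2] Department of Oncology, Third Xiangya Hospital, Central South University, Changsha 410013; [3] Department of Obstetrics and Gynaecology, Third Xiangya Hospital, Central South University, Changsha 410013; [4] School of Mathematics and Statistics, Central South University, Changsha 410083; [5] Department of Gynaecology, Henan Provincial People's Hospital, Zhengzhou 450000; [6] Department of Otolaryngology, Xiangya Hospital, Central South University, Changsha 410008; [7] School of Nursing, Hunan University of Medicine, Huaihua, Hunan 418000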

Source: Journal of Central South University (Medical Science), 2020, Issue 10, pp. 1215-1222 (8 pages)

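Funding: National Natural Science Foundation of China (81874267); Key Research and Development Program of Hunan Province (2018SK2068); "New Xiangya" Talent Project of the Third Xiangya Hospital, Central South University (20170305).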

Abstract: Objective: To evaluate the random forest algorithm for screening the influencing factors of postpartum depression and for predicting its risk.

Methods: Women in early pregnancy who received antenatal care and delivered at a tertiary hospital in Changsha, Hunan Province, between June 2017 and June 2018, and who met the inclusion and exclusion criteria, were recruited and followed up until 4-6 weeks postpartum. At enrollment, a self-designed questionnaire and the Chinese version of the Edinburgh Postnatal Depression Scale (EPDS) were used to assess demographic and economic, psychosocial, biological, obstetric, and other characteristics. Within 4-6 weeks after delivery, the Chinese version of the EPDS was used to score depression, and a self-designed postpartum questionnaire was used to collect delivery and postpartum data. The participants' data were randomly divided into a training data set and a validation data set at a ratio of 3:1. A random forest model for postpartum depression risk was built on the training data set in R, and its predictive performance on the validation data set was evaluated by accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve (AUC).

Results: A total of 406 participants were included in the final analysis; 150 of them had an EPDS score ≥9, giving an incidence of postpartum depression of 36.9%. On the validation data set, the random forest model achieved an accuracy of 80.10%, a sensitivity of 61.40%, a specificity of 89.10%, a positive predictive value of 73.00%, a negative predictive value of 82.80%, and an AUC of 0.833. Ranked by the random forest variable importance score, the ten most important predictors were antenatal depression, economic worries after delivery, work worries after delivery, serum free triiodothyronine in the first trimester, high-density lipoprotein in the third trimester, venting temper at the infant, serum total cholesterol in the first trimester, triglyceride in the first trimester, hematocrit in the third trimester, and triglyceride in the third trimester.

Conclusion: Random forest has considerable advantages in predicting the risk of postpartum depression. Through its comprehensive evaluation mechanism, it can identify the important influencing factors of postpartum depression from among many complex factors and analyze them quantitatively. This is important for identifying the key factors of postpartum depression and for timely, effective intervention.
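The evaluation measures named in the abstract can be illustrated with a minimal sketch. The study itself was carried out in R; the Python below is only an illustration of the standard definitions, not the authors' code. The names `ConfusionCounts`, `classification_metrics`, and `auc`, as well as all counts and scores, are hypothetical and are not taken from the paper, which does not report its confusion matrix in this record.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class ConfusionCounts:
    tp: int  # depressed (EPDS >= 9) and predicted depressed
    fp: int  # not depressed but predicted depressed
    tn: int  # not depressed and predicted not depressed
    fn: int  # depressed but predicted not depressed


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard binary-classification measures from a 2x2 confusion matrix."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical validation-set counts, chosen only for illustration.
example = ConfusionCounts(tp=30, fp=10, tn=90, fn=20)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Hypothetical predicted probabilities for seven made-up cases.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(f"auc: {auc(toy_labels, toy_scores):.3f}")
```

Sensitivity and specificity depend only on how each true class is classified, whereas the positive and negative predictive values also depend on how common postpartum depression is in the validation data, which is why the abstract reports all of them alongside the threshold-free AUC.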

Keywords: random forest; postpartum depression; influencing factors; risk prediction

Classification: R714.6 [Medicine and Health: Obstetrics and Gynecology]; R749.4 [Medicine and Health: Clinical Medicine]

 
