Authors: CHEN Xiao-tong; CEN Zi-xi; TAN Jing-yi; LUAN Ya; PENG Shi-shi; YAN Bo; HE Zhen
Affiliations: [1] School of Health, Guangzhou Xinhua University, Guangzhou 510310, Guangdong, China [2] School of Materials Engineering, Jiangsu University of Science and Technology, Zhenjiang 215699, Jiangsu, China
Source: Journal of Medical Information (《医学信息》), 2024, Issue 11, pp. 11-15 (5 pages)
Funding: University-level Undergraduate Innovation and Entrepreneurship Project (No. 202213902120); University-level Research Project (No. 2020KYQN03).
Abstract: Objective: To build heart failure classification and prediction models with three machine learning algorithms, compare their accuracy, and analyze which features are most important for heart failure death events, in order to support early detection and intervention and help improve health and quality of life. Methods: The heart failure dataset published on the Kaggle platform was preprocessed with missing-value imputation, data standardization, and SMOTE. Heart failure prediction models were built with the random forest, C4.5, and AdaBoost algorithms. Model performance was evaluated with the confusion matrix, ROC curve, root mean square error, and mean absolute error. Results: In the variable importance ranking given by PermutationImportance, serum creatinine level, age, and serum sodium level ranked near the top. The random forest model reached an accuracy of 85%, a precision of 81%, and a recall of 68%; the C4.5 model reached an accuracy of 83%, a precision of 80%, and a recall of 63%; the AdaBoost model reached an accuracy of 80%, a precision of 71%, and a recall of 63%. Conclusion: On the dataset used, the random forest model outperformed the C4.5 and AdaBoost models. The heart failure death risk prediction model can serve as a reference for the early prevention, control, and diagnosis of heart failure.
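The abstract reports three separate figures for each model: accuracy, precision, and recall. As a reminder of how they differ, here is a minimal sketch that derives all three from confusion-matrix counts. The names `ConfusionMatrix` and `confusion_from_labels` are hypothetical, and the labels are made-up illustrative values, not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class = death event)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative

    def accuracy(self) -> float:
        # Share of all predictions that are correct.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        # Share of predicted positives that are truly positive.
        predicted_pos = self.tp + self.fp
        return self.tp / predicted_pos if predicted_pos else 0.0

    def recall(self) -> float:
        # Share of true positives that the model actually found.
        actual_pos = self.tp + self.fn
        return self.tp / actual_pos if actual_pos else 0.0


def confusion_from_labels(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> ConfusionMatrix:
    """Tally the four confusion-matrix cells from paired labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, fn=fn, tn=tn)


# Illustrative labels only (assumption): 3 death events among 8 patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
cm = confusion_from_labels(y_true, y_pred)
print(cm.accuracy(), cm.precision(), cm.recall())
```

When the positive class is the minority, accuracy can sit well above recall, which matches the general pattern of the reported figures (for example, 85% accuracy against 68% recall for random forest).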
Keywords: heart failure; death; prediction model; C4.5; random forest; AdaBoost
Classification number: R541.6 [Medicine and Health: Cardiovascular Diseases]