A two-stage stacked heterogeneous ensemble model for predicting mortality in patients with acute respiratory distress syndrome


Authors: ZHANG Wenzheng; KONG Ping [1,2]; SONG Yan; ZHOU Liang [2]; CHEN Lifan

Affiliations: [1] School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093; [2] Collaborative Innovation Center for Biomedicine, Shanghai University of Medicine & Health Sciences, Shanghai 200237; [3] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093

Source: Beijing Biomedical Engineering (《北京生物医学工程》), 2024, No. 3, pp. 259-266 (8 pages)

Abstract:

Objective: To build a machine learning model that accurately predicts the risk of death in patients with acute respiratory distress syndrome (ARDS), and to select an imputation method that addresses the sparsity and irregularity of existing electronic health records (EHR), in order to support clinical decision-making.

Methods: Patients with ARDS who met the Berlin definition were selected from the Medical Information Mart for Intensive Care database (MIMIC-Ⅲ). Vital signs, laboratory indicators, diagnostic codes, imaging reports and other data recorded within 24 hours of admission were analyzed retrospectively. Missing values were first imputed using non-negative latent factorization, and a two-stage stacked heterogeneous ensemble learning method was then constructed to predict each patient's risk of death within 30 days. The model was evaluated using the area under the receiver operating characteristic curve (AUROC), accuracy, precision, F1 score and other metrics, and a feature importance analysis was performed.

Results: A total of 2576 patients were included, with 80% used for training and 20% for model testing. Among the imputation methods compared, non-negative latent factorization better preserved the distributional structure of the original data and achieved higher imputation accuracy. On the imputed data, the two-stage stacked ensemble model achieved an accuracy of 0.841, an AUROC of 0.846, and an F1 score of 0.586, showing better predictive performance than the other machine learning models.

Conclusions: The two-stage stacked heterogeneous ensemble learning model can effectively predict the mortality risk of ARDS patients.
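The imputation step described in the Methods can be illustrated with a small sketch. The abstract does not give the details of the authors' non-negative latent factorization model, so the code below substitutes a generic masked non-negative matrix factorization with multiplicative updates, fitted only on observed entries. The function name `masked_nmf_impute`, its parameters, and the toy matrix are hypothetical and are not taken from the paper.

```python
import random

def masked_nmf_impute(X, rank=2, n_iter=200, eps=1e-9, seed=0):
    """Fill None entries of a non-negative matrix X (list of rows).

    Illustrative stand-in only: factorizes X ~ W @ H using multiplicative
    updates computed over the observed entries, then uses W @ H to fill gaps.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Strictly positive initial factors keep every later update non-negative.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    observed = [(i, j) for i in range(n) for j in range(m) if X[i][j] is not None]

    def predict(i, j):
        return sum(W[i][k] * H[k][j] for k in range(rank))

    for _ in range(n_iter):
        # Update W with H fixed, using observed entries only.
        num = [[0.0] * rank for _ in range(n)]
        den = [[0.0] * rank for _ in range(n)]
        for i, j in observed:
            p = predict(i, j)
            for k in range(rank):
                num[i][k] += X[i][j] * H[k][j]
                den[i][k] += p * H[k][j]
        for i in range(n):
            for k in range(rank):
                W[i][k] *= num[i][k] / (den[i][k] + eps)
        # Update H with the new W fixed, again over observed entries only.
        num = [[0.0] * m for _ in range(rank)]
        den = [[0.0] * m for _ in range(rank)]
        for i, j in observed:
            p = predict(i, j)
            for k in range(rank):
                num[k][j] += X[i][j] * W[i][k]
                den[k][j] += p * W[i][k]
        for k in range(rank):
            for j in range(m):
                H[k][j] *= num[k][j] / (den[k][j] + eps)

    # Keep observed values as they are; fill missing ones from the factorization.
    return [[X[i][j] if X[i][j] is not None else predict(i, j)
             for j in range(m)] for i in range(n)]

# Toy rank-1 matrix (outer product of [1, 2, 3] and [1, 2, 4]) with one hidden entry.
X = [[1, 2, 4], [2, 4, None], [3, 6, 12]]
filled = masked_nmf_impute(X, rank=1)
print(round(filled[1][2], 3))
```

The hidden entry in the toy matrix is 8 by construction, so a rank-1 fit should recover a value close to it. On real EHR data the rank, the number of iterations, and any regularization would have to be tuned, and with a rank above 1 the filled values can depend on the random initialization.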

Keywords: acute respiratory distress syndrome; machine learning; two-stage method; mortality

Classification No.: R318.04 [Medicine and Health: Biomedical Engineering]

 
