Source: Anhui Medical and Pharmaceutical Journal (《安徽医药》), 2025, Issue 4, pp. 747-753, I0003, I0004 (9 pages in total)
Funding: Key Program of Medical Science Research, Hebei Provincial Department of Science and Technology (182777156).
Abstract:
Objective: To identify risk factors affecting the prognosis of patients with severe novel coronavirus infection (coronavirus disease 2019, COVID-19), and to build and validate a prediction model for accurately assessing poor prognosis in these patients.
Methods: Clinical indicators and outcomes (death or survival within 28 days in hospital) were collected for 526 patients with severe COVID-19 admitted to Cangzhou Central Hospital between November 1, 2022 and July 1, 2023. Using the "caret" package in R, the 526 patients were randomly split 7:3 into a training set (n=369) for model training and a test set (n=157) for model validation. Two machine learning algorithms, eXtreme Gradient Boosting (XGBoost) and random forest (RF), were used to build prediction models of clinical outcome, and SHAP was applied to interpret the XGBoost model; each model yielded its own set of variables affecting prognosis. The intersection of the RF and XGBoost variable sets was taken as the set of statistically significant variables and used to construct a decision tree model. Finally, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to evaluate the predictive performance of the decision tree model on the training and test sets.
Results: The XGBoost model identified 15 variables associated with in-hospital death and the random forest model identified 23; their intersection gave the 13 variables most strongly associated with in-hospital death: interleukin-6 (IL-6), N-terminal pro-brain natriuretic peptide (NT-proBNP), albumin (ALB), high-sensitivity cardiac troponin I (hs-cTnI), lymphocytes (LYMPH), blood lactate (Lac), α-hydroxybutyrate dehydrogenase (HBDH), creatine kinase-MB (CK-MB), arterial partial pressure of oxygen (PO2), age, blood urea nitrogen (BUN), hemoglobin (HB), and lactate dehydrogenase (LDH). A decision tree built from these 13 variables retained the 2 variables most strongly associated with death, IL-6 and lymphocytes. IL-6 was 155.48 (42.81, 691.3) ng/L in the death group, significantly higher than 15.38 (10.51, 31.11) ng/L in the survival group (Z=37387.50, P<0.001). The lymphocyte percentage was 5.4 (3.3, 12.6)% in the death group, significantly lower than 13.5 (8.62, 22.28)% in the survival group (Z=10584.50, P<0.001). The decision tree model predicted death in patients with severe COVID-19 with an AUC of 0.86 on the training set and 0.84 on the test set.
Conclusion: The decision tree model built on variables selected by XGBoost and random forest can assess poor prognosis in patients with severe COVID-19 more accurately.
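Two steps of the pipeline described in the abstract, intersecting the variable sets from two models and scoring predictions by AUC, can be sketched in plain Python. This is a minimal illustration and not the authors' code (the study used R): the feature lists, labels, and risk scores below are invented placeholders, and `shared_features` and `auc` are hypothetical helper names.

```python
from typing import List, Sequence


def shared_features(ranked_a: Sequence[str], ranked_b: Sequence[str]) -> List[str]:
    """Return the features present in both lists, in the order of ranked_a."""
    in_b = set(ranked_b)
    return [name for name in ranked_a if name in in_b]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Placeholder feature lists (assumption: NOT the study's actual model outputs).
xgb_top = ["IL-6", "LYMPH", "ALB", "Age", "extra_1"]
rf_top = ["LYMPH", "IL-6", "Age", "BUN", "ALB", "extra_2"]
common = shared_features(xgb_top, rf_top)

# Toy outcomes (1 = died, 0 = survived) and toy risk scores, invented here.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auc = auc(toy_labels, toy_scores)

print(common)
print(round(toy_auc, 3))
```

The pairwise form of AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based computation would be the usual choice for larger data.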