Efficiencies of 5 types of machine learning models in prediction of risk of hospital-associated infection in patients undergoing pancreaticoduodenectomy


Authors: ZHANG Jin-juan[1]; WANG Yi-jun[1]; SUN Wei[1]; WANG Su-mei[2]; LIU Hui[1]; LIU Chang-li[1]; CUI Wei; CHU Cheng-long; SHEN Yun-zhi[1] (The Third Central Hospital of Tianjin, Tianjin Key Laboratory of Extracorporeal Life Support for Critical Diseases, Artificial Cell Engineering Technology Research Center, Tianjin Institute of Hepatobiliary Disease, Tianjin 300170, China)

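Affiliations: [1] Department of Hepatobiliary and Pancreatic Surgery, The Third Central Hospital of Tianjin; Tianjin Key Laboratory of Extracorporeal Life Support for Critical Diseases; Artificial Cell Engineering Technology Research Center; Tianjin Institute of Hepatobiliary Disease, Tianjin 300170, China. [2] Department of Clinical Laboratory, The Third Central Hospital of Tianjin, Tianjin 300170, China.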

Source: Chinese Journal of Nosocomiology, 2025, No. 2, pp. 235-240 (6 pages)

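Funding: Key Research Project of the Tianjin Municipal Health Commission (15KG114); General Project of the Tianjin Municipal Health Commission (TJWJ2022MS021); Tianjin Science and Technology Support Project (20YFZCSY00310).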

Abstract:
OBJECTIVE To build risk prediction models for hospital-associated infection (HAI) after pancreaticoduodenectomy (PD) using different machine learning (ML) algorithms, so as to help identify high-risk patients and support clinical treatment decisions.
METHODS A total of 228 patients who underwent PD at the Third Central Hospital of Tianjin from January 2016 to July 2023 were randomly selected and divided by the random number table method, in a 7:3 ratio, into a training set of 159 cases (44 with infection, 115 without) and a test set of 69 cases (21 with infection, 48 without). Clinical variables were screened in the training set by Lasso regression. Models were then built on the training set with logistic regression, XGBoost, random forest (RF), support vector machine (SVM) and multilayer perceptron (MLP) algorithms. Diagnostic performance was evaluated with receiver operating characteristic (ROC) curves, area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1 score and Kappa value.
RESULTS Of the 228 patients, 65 developed postoperative HAI, an infection rate of 28.51%. Lasso regression with ten-fold cross-validation selected 6 clinical variables: aspartate aminotransferase (AST), drinking history, C-reactive protein (CRP), pancreatic fistula, biliary fistula and delayed gastric emptying. Five ML models were built on these variables. In the training set, XGBoost and RF performed best among all models. For XGBoost, the ROC-AUC, cutoff, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1 score and Kappa value were 1.000, 0.509, 0.992, 1.000, 0.995, 1.000, 0.989, 1.000 and 0.980, respectively; for RF, the corresponding values were 1.000, 0.475, 0.987, 0.997, 0.990, 0.989, 0.986, 0.993 and 0.966. In the test set, RF performed best (ROC-AUC = 0.773, 95% CI: 0.581-0.965), while XGBoost reached a ROC-AUC of 0.704 (95% CI: 0.504-0.904); the gap between training and test performance suggests possible overfitting.
CONCLUSION The RF model was the best-performing model for predicting the risk of HAI after PD; it can help identify high-risk patients and support clinical treatment decisions.
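The threshold-based metrics listed in the abstract (accuracy, sensitivity, specificity, positive and negative predictive value, F1 score and Kappa) can all be derived from a single 2x2 confusion matrix. The sketch below shows one standard way to compute them in plain Python. The names `ConfusionMatrix` and `binary_metrics` are hypothetical helpers, not the authors' code, and the counts in the usage example are made up for illustration; only the class totals (21 infected and 48 uninfected patients in the test set) come from the abstract. It assumes no denominator is zero.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # infected, predicted infected
    fp: int  # not infected, predicted infected
    fn: int  # infected, predicted not infected
    tn: int  # not infected, predicted not infected


def binary_metrics(cm: ConfusionMatrix) -> Dict[str, float]:
    """Compute standard binary classification metrics from a confusion matrix."""
    n = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / n
    sensitivity = cm.tp / (cm.tp + cm.fn)  # recall on infected patients
    specificity = cm.tn / (cm.tn + cm.fp)  # recall on uninfected patients
    ppv = cm.tp / (cm.tp + cm.fp)          # positive predictive value
    npv = cm.tn / (cm.tn + cm.fn)          # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((cm.tp + cm.fp) / n) * ((cm.tp + cm.fn) / n)
    p_no = ((cm.tn + cm.fn) / n) * ((cm.tn + cm.fp) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Illustrative counts only (NOT reported in the paper); they merely
    # respect the test-set class totals: 14 + 7 = 21 infected, 8 + 40 = 48 not.
    example = ConfusionMatrix(tp=14, fp=8, fn=7, tn=40)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because every one of these metrics depends on the chosen probability cutoff, a model can score near 1.000 on the training set and far lower on held-out data, which is the pattern the abstract flags as possible overfitting.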

Keywords: hospital-associated infection; risk factor; machine learning; prediction model; diagnostic performance; pancreaticoduodenectomy

Classification number: R619.3 [Medicine and Health: Surgery]

 
