Value of XGBoost machine learning model for diagnosis of hepatitis B cirrhosis  Cited by: 1


Authors: Ji Li; Ke-Xing Han; Jia-Pei Shen; Wei-Jie Sun; Long Gao; Yu-Feng Gao[1] (Department of Infection, The First Affiliated Hospital of Anhui Medical University, Hefei 230032, Anhui Province, China)

Affiliation: [1] Department of Infection, The First Affiliated Hospital of Anhui Medical University, Hefei 230032, Anhui Province, China

Source: World Chinese Journal of Digestology, 2023, No. 13, pp. 544-554 (11 pages)

Funding: Natural Science Foundation of Anhui Province, No. 2208085MH204.

Abstract:
BACKGROUND The progression of chronic hepatitis B virus infection (CHBV) to cirrhosis is slow and easily overlooked, and building noninvasive diagnostic models for cirrhosis from routine clinical indicators has become a hot research topic. However, machine learning models for the early diagnosis of cirrhosis are still lacking.
AIM To investigate the performance of the extreme gradient boosting (XGBoost) machine learning model in the noninvasive diagnosis of hepatitis B cirrhosis.
METHODS A retrospective analysis was performed on 1087 CHBV patients who first presented to the Departments of Infection of the First and Second Affiliated Hospitals of Anhui Medical University from 2010 to 2018. The patients were randomly divided into a training set and a validation set in a 3:1 ratio. Clinical data were collected for all participants, and a prediction model was built with XGBoost. The aspartate aminotransferase/platelet ratio index (APRI) and fibrosis-4 index (FIB-4) scores were also calculated and compared with the XGBoost model. The area under the receiver operating characteristic curve (AUC) was used to assess model discrimination, and the calibration curve (CA) and decision curve analysis (DCA) were used to assess model calibration and benefit.
RESULTS A total of 1087 CHBV patients were included, 817 in the training set and 270 in the validation set. No predictor variable differed significantly between the training and validation sets (P>0.05). Cirrhosis occurred in 103 patients in the training set, and APRI and FIB-4 scores were significantly higher in cirrhotic patients than in non-cirrhotic patients (P<0.05). Platelet count had the highest relative importance among all predictors. The AUCs of the model in the training and validation sets were 0.95 and 0.86 (P<0.05), respectively, and the Kappa values were 0.78 and 0.74, suggesting that the model was reproducible. CA curve analysis indicated a high degree of agreement between the model's predictions and the observed outcomes. The DCA curves for the training and validation sets indicated that the model could provide patients with a high degree of benefit. The XGBoost machine learning model outperformed the APRI and FIB-4 scores for cirrhosis.
CONCLUSION The XGBoost model built from clinical information commonly available for CHBV patients performs well in the diagnosis of cirrhosis and merits further clinical application.
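The abstract names APRI, FIB-4, and AUC without stating how they are computed. As a minimal sketch, the Python below uses the commonly cited definitions of the two scores and a rank-based AUC. The formulas, function names, and the example values are assumptions for illustration and are not taken from the paper; in particular, the upper limit of normal for AST is laboratory-specific.

```python
import math


def apri(ast_u_l, ast_uln_u_l, platelets_1e9_l):
    """APRI = (AST / upper limit of normal AST) / platelets (10^9/L) * 100.

    Commonly cited definition; the ULN value is an assumption supplied by the caller.
    """
    return (ast_u_l / ast_uln_u_l) / platelets_1e9_l * 100.0


def fib4(age_years, ast_u_l, alt_u_l, platelets_1e9_l):
    """FIB-4 = (age * AST) / (platelets (10^9/L) * sqrt(ALT)). Commonly cited definition."""
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical patient, not from the study: age 50, AST 80 U/L (ULN 40 U/L),
# ALT 64 U/L, platelets 100 x 10^9/L.
example_apri = round(apri(80, 40, 100), 2)
example_fib4 = round(fib4(50, 80, 64, 100), 2)
```

For the hypothetical patient above, the rounded values are APRI 2.0 and FIB-4 5.0. Applying `auc` to each score and to a model's predicted probabilities on the same labels gives a like-for-like comparison of discrimination, which is the kind of comparison the abstract reports.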

Keywords: Chronic hepatitis B; Cirrhosis; Prediction model; XGBoost machine learning model

Classification: R512.62 [Medicine and Health: Internal Medicine]; R575.2 [Medicine and Health: Clinical Medicine]

 
