Construction and validation of a prognostic model for NK/T-cell lymphoma based on random survival forest algorithm


Authors: HUANG Zhen; WU Yazhou [1] (Department of Health Statistics, Faculty of Military Preventive Medicine, Army Medical University (Third Military Medical University), Chongqing; Center for Hematology, First Affiliated Hospital, Army Medical University (Third Military Medical University), Chongqing, China)

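Affiliations: [1] Department of Health Statistics, Faculty of Military Preventive Medicine, Army Medical University (Third Military Medical University), Chongqing; [2] Center for Hematology, First Affiliated Hospital, Army Medical University (Third Military Medical University), Chongqing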

Source: Journal of Army Medical University (《陆军军医大学学报》), 2025, Issue 3, pp. 275-284 (10 pages)

Funding: National Natural Science Foundation of China, General Program (82173621, 81872716).

Abstract: Objective To investigate the prognostic factors affecting survival in patients with natural killer T-cell lymphoma (NKTL), and to develop a prognostic model for predicting their overall survival (OS) based on the random survival forest (RSF) algorithm.

Methods Demographic and clinicopathological data of NKTL patients from 2000 to 2020 were collected from the SEER database. The patients were divided into a training cohort (n=471) and a validation cohort (n=203) in a 7:3 ratio. Cox regression analysis was performed to identify prognostic factors affecting OS, and a nomogram model was constructed from the identified factors. In parallel, the RSF algorithm was used to identify prognostic factors affecting OS and to build the RSF model. The models were evaluated using receiver operating characteristic (ROC) curves, calibration curves, decision curves, net reclassification improvement (NRI), and integrated discrimination improvement (IDI), and the predictive performance of the two models was compared. A risk score was calculated for each patient with each model; patients were then divided into high-risk and low-risk groups at the median risk score, and survival curves were plotted for comparison.

Results Ann Arbor stage, age, radiotherapy, combined treatment, and type of disease were identified as prognostic variables significantly associated with survival. In the validation cohort, the area under the ROC curve (AUC) of the nomogram model at 1, 3, and 5 years was 0.745, 0.771, and 0.748, respectively, while the AUC of the RSF model at the same time points was 0.764, 0.792, and 0.761. The ROC curves indicated that both models had good accuracy and discrimination in predicting OS. The calibration curves showed good agreement between predicted and observed survival for both models. Both models effectively stratified the patients into poor-prognosis and favorable-prognosis groups, and OS in the poor-prognosis group was significantly shorter than in the favorable-prognosis group (P<0.0001). The decision curves showed that the net benefit of the RSF model was greater than that of the nomogram model. Compared with the nomogram model, the RSF model had an NRI of 0.184 (95%CI: 0.098~0.267, P<0.01) and an IDI of 0.300 (95%CI: 0.241~0.359, P<0.01), indicating that the RSF model had better predictive ability than the nomogram model.

Conclusion Ann Arbor stage, age, radiotherapy, combined treatment, and type of disease are prognostic factors for NKTL patients. The RSF model built on these factors predicts the prognosis of NKTL patients well and can be used to assess patient prognosis effectively.
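The cohort split and the median-based risk grouping described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `split_cohort` and `stratify_by_median`, the random seeds, the placeholder risk scores, and the choice to assign a score equal to the median to the low-risk group are all assumptions made for illustration. The total of 674 patients is simply the sum of the two reported cohort sizes.

```python
import random
import statistics


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient IDs into training and validation cohorts.

    Hypothetical helper: the abstract only states a 7:3 ratio, not how
    the split was drawn.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # floor of 70% of the cohort
    return ids[:n_train], ids[n_train:]


def stratify_by_median(risk_scores):
    """Label each patient 'high' or 'low' risk relative to the median score.

    Assumption: a score exactly equal to the median is labelled 'low'.
    """
    cutoff = statistics.median(risk_scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in risk_scores.items()}


# 471 + 203 = 674 patients in total, as reported in the abstract.
train_ids, valid_ids = split_cohort(range(674))

# Placeholder risk scores; in the study these would come from the
# nomogram or the RSF model, which are not reproduced here.
rng = random.Random(1)
scores = {pid: rng.random() for pid in valid_ids}
groups = stratify_by_median(scores)

print(len(train_ids), len(valid_ids))
```

With a floor-based 70% cut, 674 patients yield cohorts of the same sizes as those reported (471 and 203). The two risk groups produced by `stratify_by_median` would then be compared with survival curves, as the Methods describe.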

Keywords: NK/T-cell lymphoma; SEER database; nomogram; machine learning; random survival forest; survival prediction

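Classification codes: R195.1 [Medicine and Health: Health Statistics]; R730.7 [Medicine and Health: Health Services Management]; R733.4 [Medicine and Health: Public Health and Preventive Medicine]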

 
