Affiliations: [1] Department of Gastroenterology, Beijing Key Laboratory for Helicobacter Pylori Infection and Upper Gastrointestinal Diseases, Peking University Third Hospital, Beijing 100191, China; [2] Beijing Aerospace Wanyuan Science Technology Co., Ltd., China Academy of Launch Vehicle Technology, Beijing 100176, China; [3] Department of Gastroenterology, The Affiliated Hospital of Qingdao University, Qingdao 266003, Shandong Province, China; [4] Institute of Automation, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, Shandong Province, China
Source: World Journal of Gastrointestinal Oncology (世界胃肠肿瘤学杂志, English edition), 2024, Issue 12, pp. 4597-4613 (17 pages)
Funding: Supported by the National Natural Science Foundation of China, No. 81802777.
Abstract: BACKGROUND Colorectal cancer (CRC) is characterized by high heterogeneity, aggressiveness, and high morbidity and mortality rates. With machine learning (ML) algorithms, patient, tumor, and treatment features can be used to develop and validate models for predicting survival. In addition, important variables can be screened and different applications can be provided that could serve as vital references when making clinical decisions, potentially improving patient outcomes in clinical settings. AIM To construct prognostic prediction models and screen important variables for patients with stage I to III CRC. METHODS More than 1000 postoperative CRC patients were grouped according to survival time (with cutoff values of 3 years and 5 years) and assigned to training and testing cohorts (7:3). For each 3-category survival time, predictions were made by 4 ML algorithms (on all-variable and important-variable-only datasets), each of which was validated via 5-fold cross-validation and bootstrap validation. Important variables were screened with multivariable regression methods. Model performance was evaluated with the area under the curve (AUC) and compared before and after variable screening. SHapley Additive exPlanations (SHAP) further demonstrated the impact of important variables on model decision-making. Nomograms were constructed for practical model application. RESULTS Our ML models performed well; model performance before and after important-variable identification was consistent, and variable screening was effective. The highest pre- and postscreening model AUCs (95% confidence intervals) in the testing set were 0.87 (0.81-0.92) and 0.89 (0.84-0.93) for overall survival, 0.75 (0.69-0.82) and 0.73 (0.64-0.81) for disease-free survival, 0.95 (0.88-1.00) and 0.88 (0.75-0.97) for recurrence-free survival, and 0.76 (0.47-0.95) and 0.80 (0.53-0.94) for distant metastasis-free survival. Repeated cross-validation and bootstrap validation were performed in both the training and testing datasets. The SHAP values of the important variables were consistent with the clinicopath
Keywords: Colorectal cancer; Machine learning; Prognostic prediction model; Survival times; Important variables
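Illustrative note: the abstract reports each AUC with a 95% confidence interval and mentions bootstrap validation, but it does not say how the intervals were computed. The sketch below shows one common approach, a percentile bootstrap around a rank-based AUC. It assumes a binary outcome for simplicity (the study itself uses 3-category survival times), and the function names `auc` and `bootstrap_auc_ci`, the resample count, and the seed are hypothetical choices made for this example, not details taken from the paper.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap (1 - alpha) interval.

    Assumption: cases are resampled with replacement, and resamples that
    contain only one class are redrawn because AUC is undefined for them.
    """
    point = auc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(bootstrap_auc_ci(y, s, n_boot=200))
```

With very small test sets the percentile interval becomes wide, which fits the broad intervals reported for distant metastasis-free survival (for example 0.47-0.95).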