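Affiliation: [1] Department of Radiotherapy, Cancer Center, Lu'an Hospital of Anhui Medical University, Lu'an 237000
Source: Journal of International Oncology (《国际肿瘤学杂志》), 2025, Issue 1, pp. 31-37 (7 pages)
Funding: Lu'an Science and Technology Plan (2022lakj042).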
Abstract: Objective To construct a model, based on machine learning (ML) algorithms, for predicting grade ≥2 radiation esophagitis (RE) in patients with esophageal cancer during concurrent chemoradiotherapy (CRT).
Methods The clinical data of 276 patients with esophageal cancer who received CRT at Lu'an Hospital of Anhui Medical University from January 2018 to January 2023 were analyzed retrospectively. RE was graded according to the Radiation Therapy Oncology Group criteria, with grade ≥2 RE as the outcome event. After variables were screened by least absolute shrinkage and selection operator (LASSO) regression, the dataset was rebuilt and split in a 7:3 ratio into a training set (n=193) and a testing set (n=83), which were used with four ML models: random forest (RF), decision tree (DT), extreme gradient boosting (XGBoost), and support vector machine (SVM). Training and model optimization were carried out on the training set. Model performance was evaluated on the testing set with the receiver operating characteristic (ROC) curve, and the area under the curve (AUC), accuracy, precision, sensitivity, and F1 score were calculated. SHAP analysis was used to interpret the best-performing model.
Results By the end of follow-up, 91 patients (32.97%) had developed grade ≥2 RE during CRT. The grade ≥2 RE group (n=91) and the non-RE group (n=185) differed significantly in tumor lesion length (Z=-5.53, P<0.001), Karnofsky performance status (KPS) score (χ²=5.92, P=0.015), Eastern Cooperative Oncology Group (ECOG) score (χ²=4.01, P=0.045), hypertension (χ²=15.35, P<0.001), diabetes (χ²=13.06, P<0.001), white blood cell count (Z=-6.59, P<0.001), neutrophil count (Z=-6.72, P<0.001), and radiotherapy dose (χ²=9.81, P=0.002). LASSO regression ultimately selected 7 feature variables: tumor lesion length, ECOG score, KPS score, neutrophil count, hypertension, diabetes, and radiotherapy dose. ROC curve analysis showed that the XGBoost model had the best predictive performance, with an AUC of 0.90, accuracy of 0.82, precision of 0.80, sensitivity of 0.73, and F1 score of 0.76. The RF model had an AUC of 0.89, accuracy of 0.78, precision of 0.76, sensitivity of 0.48, and F1 score of 0.59. The DT model had an AUC of 0.72, accuracy of 0.72, precision of 0.44, sensitivity of 0.60, and F1 score of 0.52. The SVM model had an AUC of 0.74, accuracy of 0.82, precision of 0.52, sensitivity of 0.88, and F1 score of 0.65. SHAP analysis was used to interpret the XGBoost model, …
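The F1 scores in the abstract can be related to the reported precision and sensitivity, since F1 is by definition the harmonic mean of the two. The snippet below is a minimal sketch of that check, not the authors' code: the function name `f1_score` and the `reported` dictionary are illustrative, and the numbers are simply the rounded values quoted in the abstract, so small differences from rounding are expected.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    if precision + sensitivity == 0:
        return 0.0  # avoid division by zero when both metrics are 0
    return 2 * precision * sensitivity / (precision + sensitivity)


# (precision, sensitivity, reported F1), copied from the abstract.
# These are rounded to two decimals in the source.
reported = {
    "XGBoost": (0.80, 0.73, 0.76),
    "RF": (0.76, 0.48, 0.59),
    "DT": (0.44, 0.60, 0.52),
    "SVM": (0.52, 0.88, 0.65),
}

for name, (precision, sensitivity, f1_reported) in reported.items():
    f1 = f1_score(precision, sensitivity)
    print(f"{name}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported:.2f}")
```

Because the inputs are already rounded, the recomputed values only need to agree with the reported ones to within about 0.01 to 0.02.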