Machine learning models based on radiomics features from CT images for predicting prognosis of clinical stage ⅠA lung adenocarcinoma  Cited by: 2


Authors: Liu Mengwen; Jiang Xu; Zhang Xue; Jiang Jiuming; Qiu Bin; Li Meng; Zhang Li

Affiliations: [1] Department of Diagnostic Radiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China [2] Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China

Source: Digital Medicine and Health, 2023, Issue 2, pp. 82-89 (8 pages)

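Funding: Beijing Natural Science Foundation (7184238); Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences (2021‑I2M‑C&T‑B‑061, 2022‑I2M‑C&T‑B‑076); Wu Jieping Medical Foundation Excellence in Surgery Fund (320.320.2730.1867).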

Abstract:

Objective: To evaluate different machine learning methods based on radiomics features extracted from CT images for predicting the prognosis of clinical stage ⅠA lung adenocarcinoma.

Methods: This retrospective analysis included 541 patients with clinical stage ⅠA lung adenocarcinoma who underwent surgical resection at the Cancer Hospital, Chinese Academy of Medical Sciences, between May 2005 and December 2018, and who either developed recurrence or metastasis or remained free of both after at least 5 years of follow‑up. Using simple random sampling, patients were divided at a 7∶3 ratio into a training set (n=379) and a test set (n=162). Radiomics features were extracted from preoperative CT images of the primary tumor and selected using the maximum relevance minimum redundancy algorithm and the least absolute shrinkage and selection operator (LASSO). Seven machine learning models were then built on the selected features: Bayesian, decision tree, K‑nearest neighbor, logistic regression, random forest, support vector machine, and extreme gradient boosting (XGB). The best‑performing model was chosen by its average area under the curve (AUC) from 10‑fold cross‑validation on the test set. Univariate and multivariate Cox regression analyses were performed to identify statistically significant clinical features. A combined model integrating radiomics and clinical features was then built with the algorithm of the best radiomics model. The radiomics model and the combined model were evaluated by average AUC, accuracy, precision, sensitivity, specificity, F1 score, calibration curve, logistic loss, decision curve analysis (DCA), and average optimal net benefit.

Results: Thirteen radiomics features were selected. On the test set, the XGB radiomics model had the highest average AUC, 0.914. The combined model raised the average AUC to 0.956, with an accuracy of 90.0%, a precision of 97.7%, a sensitivity of 90.5%, a specificity of 86.7%, and an F1 score of 93.9%. DCA showed that the combined model provided the best clinical benefit.

Conclusions: The XGB model based on radiomics features is of high value for predicting the prognosis of clinical stage ⅠA lung adenocarcinoma, and the combined model based on radiomics and clinical features performs better still and can support clinical decision‑making.
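The evaluation metrics named in the abstract follow from a binary confusion matrix, and the net benefit used in decision curve analysis follows from the same counts plus a threshold probability. The sketch below is only an illustration of those standard formulas, not the authors' code: the class `Confusion`, the function names, and the example counts are hypothetical, and the abstract does not say which outcome was treated as the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (hypothetical container, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def classification_metrics(c: Confusion) -> dict:
    """Accuracy, precision, sensitivity, specificity and F1 from raw counts."""
    precision = c.tp / (c.tp + c.fp)
    sensitivity = c.tp / (c.tp + c.fn)
    return {
        "accuracy": (c.tp + c.tn) / c.total,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": c.tn / (c.tn + c.fp),
        "f1": f1_score(precision, sensitivity),
    }


def net_benefit(c: Confusion, threshold: float) -> float:
    """Decision-curve net benefit at one threshold probability in (0, 1).

    False positives are weighted by the odds of the threshold.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    return c.tp / c.total - (c.fp / c.total) * odds


# Made-up counts for illustration only; they are NOT the study's data.
example = Confusion(tp=40, fp=5, tn=45, fn=10)

if __name__ == "__main__":
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.3f}")
    print(f"net benefit at threshold 0.2: {net_benefit(example, 0.2):.4f}")
    # Consistency check on the reported precision and sensitivity.
    print(f"F1 from 97.7% precision and 90.5% sensitivity: {f1_score(0.977, 0.905):.4f}")
```

Applying `f1_score` to the reported precision (97.7%) and sensitivity (90.5%) gives about 0.94, which agrees with the reported F1 score of 93.9% once rounding of the inputs is taken into account.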

Keywords: Lung adenocarcinoma; Tomography; Radiomics; Machine learning

Classification code: R734.2 [Medicine and Health: Oncology]

 
