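Affiliations: [1] Clinical Research Center, Affiliated Hospital of Nantong University, and Medical School of Nantong University, Nantong, Jiangsu 226001 [2] Department of Thoracic Surgery, Peking University Third Hospital, Beijing 100191 [3] Department of Pathology, Affiliated Hospital of Nantong University, Nantong, Jiangsu 226001 [4] Department of Cardiothoracic Surgery, Affiliated Hospital of Nantong University, Nantong, Jiangsu 226001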
Source: Chinese Journal of Clinical Thoracic and Cardiovascular Surgery (《中国胸心血管外科临床杂志》), 2025, Issue 1, pp. 67-72 (6 pages)
Abstract:
Objective To explore the accuracy of machine learning algorithms based on SHOX2 and RASSF1A methylation levels in predicting the pathological type of early-stage lung adenocarcinoma.
Methods A retrospective analysis was conducted on formalin-fixed paraffin-embedded (FFPE) specimens from patients who underwent lung tumor resection at the Affiliated Hospital of Nantong University from January 2021 to January 2023. Based on the pathological classification of the tumors, patients were divided into three groups: a benign tumor/adenocarcinoma in situ (BT/AIS) group, a minimally invasive adenocarcinoma (MIA) group, and an invasive adenocarcinoma (IA) group. The methylation levels of SHOX2 and RASSF1A in the FFPE specimens were measured with the LungMe kit by methylation-specific PCR (MS-PCR). Using the SHOX2 and RASSF1A methylation levels as predictive variables, machine learning algorithms (logistic regression, XGBoost, random forest, and naive Bayes) were employed to predict the different pathological types of lung adenocarcinoma, and a web server was built for clinical use.
Results A total of 272 patients were included. The average ages of patients in the BT/AIS, MIA, and IA groups were 57.97, 61.31, and 63.84 years, respectively. The proportions of female patients were 55.38%, 61.11%, and 61.36%, respectively. Among the early-stage lung adenocarcinoma prediction models built on SHOX2 and RASSF1A methylation levels, the random forest and XGBoost models performed well in predicting each pathological type. The C-statistics of the random forest model for the BT/AIS, MIA, and IA groups were 0.71, 0.72, and 0.78, respectively. The C-statistics of the XGBoost model for the BT/AIS, MIA, and IA groups were 0.70, 0.75, and 0.77, respectively. The naive Bayes model showed reasonably robust performance only in the IA group, with a C-statistic of 0.73, indicating some predictive ability. The logistic regression model performed worst in every group, showing no predictive ability for any of them. In decision curve analysis, the random forest model showed a higher net benefit in predicting the BT/AIS and MIA pathological types, indicating potential value in clinical application.
Conclusion Machine learning algorithms based on SHOX2 and RASSF1A methylation levels predict the pathological type of early-stage lung adenocarcinoma with relatively high accuracy.
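The abstract reports a C-statistic per pathological group and a net benefit from decision curve analysis, but does not say how either was computed. The following is a minimal sketch of the two standard definitions, treating one group (here IA) as the positive class against the rest. The function names, the toy labels, and the predicted probabilities are all hypothetical and are not data from the study.

```python
from typing import Sequence


def c_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """One-vs-rest C-statistic (equivalent to ROC AUC for a binary label).

    It is the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability, as used in decision curves.

    net benefit = TP/n - (FP/n) * threshold / (1 - threshold),
    where a case is called positive when its score is >= threshold.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = IA, 0 = any other group (not from the study).
is_ia = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical model-predicted probabilities of IA for the same eight cases.
prob_ia = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

auc_ia = c_statistic(is_ia, prob_ia)
nb_ia = net_benefit(is_ia, prob_ia, threshold=0.5)
```

Repeating the calculation with each group in turn as the positive class gives one C-statistic per group, which matches the shape of the results reported above. A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the treat-all and treat-none strategies.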