Development of a prognostic model for lung adenocarcinoma based on tumor signaling pathways and identification of key driver genes using bioinformatics


Authors: Li Jinggeng; Che Yinggang; Liu Miaoyan; Zhang Jian

Affiliations: [1] Department of Pulmonary and Critical Care Medicine, the First Affiliated Hospital of Air Force Military Medical University, Xi'an 710032, China; [2] Department of Respiratory Medicine, Air Force Hospital of Western War Zone, Chengdu 610000, China; [3] Xi'an Medical University, Xi'an 710021, China; [4] Department of Respiratory Medicine, People's Hospital of Northwest University (Xi'an Fourth Hospital), Xi'an 710004, China

Source: International Journal of Respiration (《国际呼吸杂志》), 2025, Issue 2, pp. 143-152 (10 pages)

Funding: Natural Science Basic Research Program of Shaanxi Province (2024JC-ZDXM-45).

Abstract: Objective: To use bioinformatics methods to develop a prognostic model for lung adenocarcinoma based on tumor signaling pathways, and to identify key driver genes associated with the pathogenesis of lung adenocarcinoma.

Methods: Transcriptome expression data and clinical features of 450 lung adenocarcinoma tissues and 58 adjacent normal lung tissues were obtained from The Cancer Genome Atlas (TCGA) database, and those of 926 lung adenocarcinoma tissues were obtained from the Gene Expression Omnibus (GEO) database. The activities of fourteen signaling pathways were inferred for samples from the TCGA, GSE30219, and GSE50081 datasets: androgen, epidermal growth factor receptor (EGFR), estrogen, hypoxia, Janus kinase/signal transducer and activator of transcription (JAK-STAT), mitogen-activated protein kinase (MAPK), nuclear factor κB, p53, phosphoinositide 3-kinase (PI3K), transforming growth factor-β, tumor necrosis factor-α, TNF-related apoptosis-inducing ligand, vascular endothelial growth factor, and Wnt. Univariate Cox regression analysis was performed to assess the impact of each signaling pathway's activity on patient prognosis in lung adenocarcinoma tissues from the TCGA, GSE30219, and GSE50081 datasets. Spearman rank correlation analysis was used to examine the correlation between signaling pathway activity and clinical stage. Consensus clustering was used to group the TCGA lung adenocarcinoma samples according to the activities of the EGFR, MAPK, PI3K, and p53 signaling pathways, and Kaplan-Meier survival curves were plotted to compare survival among the resulting subgroups. Differentially expressed genes (DEGs) between the C2 subgroup and the other subgroups were identified, and univariate Cox regression analysis was carried out on these genes in the TCGA, GSE30219, and GSE50081 datasets to screen for co-expressed prognosis-related genes. Key prognostic genes were then selected using four machine learning algorithms, and a prognostic model for lung adenocarcinoma was constructed. The model was used to calculate a risk score for each patient, and patients were divided into high-risk and low-risk groups according to the optimal cutoff value of the risk score. Performance was evaluated with Kaplan-Meier survival curves and receiver operating characteristic (ROC) curves. Univariate and multivariate Cox regression analyses were used to identify factors influencing overall survival (OS) in patients with lung adenocarcinoma. A nomogram based on the model risk score and relevant clinical features was constructed to predict 1-year, 3-year, and 5-year survival rates, and was validated by calibration curves and decision curve analysis. Mendelian randomization analysis was used to examine the association between PHF19 gene expression and lung adenocarcinoma. A prediction targeting PHF19 …
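The risk-scoring and survival-comparison steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' model: the gene names, coefficients, cutoff, and follow-up data below are hypothetical, and the abstract does not specify which machine learning algorithms or software were used. The sketch assumes a linear risk score (a weighted sum of gene expression values), a fixed cutoff for the high-risk/low-risk split, and the standard Kaplan-Meier product-limit estimator.

```python
from typing import Dict, List, Sequence, Tuple


def risk_score(expression: Dict[str, float], coefficients: Dict[str, float]) -> float:
    """Linear risk score: sum of coefficient * expression over the model genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_cutoff(scores: Sequence[float], cutoff: float) -> List[str]:
    """Label each patient 'high' if the score exceeds the cutoff, else 'low'."""
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival probability) at each event time.

    events[i] is 1 if the patient died at times[i], 0 if censored.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical two-gene model and four hypothetical patients.
COEFFICIENTS = {"GENE_A": 0.5, "GENE_B": -0.25}
PATIENTS = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 1.0, "GENE_B": 2.0},
    {"GENE_A": 3.0, "GENE_B": 0.0},
    {"GENE_A": 0.0, "GENE_B": 4.0},
]
CUTOFF = 0.5  # hypothetical; the study derives an optimal cutoff from its data

scores = [risk_score(p, COEFFICIENTS) for p in PATIENTS]
groups = split_by_cutoff(scores, CUTOFF)

# Hypothetical follow-up times (months) and event indicators (1 = death).
curve = kaplan_meier([2, 3, 3, 5], [1, 1, 0, 1])
```

In practice, a Kaplan-Meier curve would be estimated separately for the high-risk and low-risk groups and the two curves compared, for example with a log-rank test.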

Keywords: lung adenocarcinoma; tumor signaling pathway; computational biology; machine learning; prognostic model

Classification number: R73 [Medicine and Health: Oncology]

 
