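Affiliations: [1] State Key Laboratory of Molecular Oncology, National Cancer Center / National Clinical Research Center for Cancer / Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021; [2] Department of Medical Oncology, National Cancer Center / National Clinical Research Center for Cancer / Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100021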
Source: Clinical Medicine of China (《中国综合临床》), 2020, No. 3, pp. 217-222 (6 pages)
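Funding: National Key Basic Research Program of China (973 Program) (2015CB553904); Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences (2016-I2M-1-001, 2019-I2M-1-003); National Natural Science Foundation of China (81872280); Open Project of the State Key Laboratory of Molecular Oncology (SKL-KF-2017-16); Independent Innovation Project of the State Key Laboratory of Molecular Oncology (SKL-2017-16).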
Abstract:
Objective To construct a prognostic prediction model for breast cancer patients based on long non-coding RNA (LncRNA) expression signatures.
Methods LncRNA expression profiles and clinical characteristics were analyzed in transcriptome sequencing data from 1081 breast cancer patients in The Cancer Genome Atlas (TCGA) database. Differential expression analysis (DESeq2 package) and univariate analysis were performed on transcriptome sequencing data from 112 paired breast cancer and normal breast tissue samples in TCGA to screen for LncRNAs that were both differentially expressed and significantly associated with breast cancer prognosis (DELncRNAs); to reduce batch effects, the sequencing data were normalized with the DESeq function. The 1081 patients were randomly divided into a training set (n=541) and a validation set (n=540). The DELncRNAs were entered into a Cox proportional hazards regression model; a multi-LncRNA prognostic model was selected and built in the training set and then tested for the proportional hazards (PH) assumption. A multi-gene risk score was calculated and used to divide patients into high-risk and low-risk groups. Survival was analyzed with the Kaplan-Meier method, and the model was validated on the 540 patients of the validation set. The prognostic value of the model was also evaluated in TCGA patients with other cancers, such as lung squamous cell carcinoma and hepatocellular carcinoma. Gene set enrichment analysis (GSEA) was used to explore the mechanisms by which the LncRNAs affect patient survival.
Results Transcriptome sequencing analysis identified 2815 differentially expressed genes, of which 91 were LncRNAs significantly associated with breast cancer prognosis (P<0.05). Cox regression on the expression data of these 91 DELncRNAs in the 541 training-set patients yielded a Cox proportional hazards model based on five LncRNAs, AC004551.1, MTOR-AS1, KCNAB1-AS2, FAM230G and LINC01283 (training set AUC=0.746, validation set AUC=0.650); the PH assumption test gave P=0.388. Kaplan-Meier analysis showed that in the training set the high-risk group had markedly worse survival than the low-risk group (median survival 7.049 vs 12.21 years; HR 0.367, 95%CI 0.228~0.597, P<0.001). In the validation set the high-risk group also had markedly shorter survival than the low-risk group (median survival 7.57 vs 10.85 years; HR 0.412, 95%CI 0.214~0.793, P<0.001). Similar predictive results were obtained in other TCGA cancer types: lung squamous cell carcinoma (HR 0.604, 95%CI 0.383~0.951, P=0.007) and hepatocellular carcinoma…
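The risk-scoring and survival steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline. The abstract names the five LncRNAs but reports neither their Cox coefficients nor the cut-off used to define the risk groups, so the coefficients in `COEFS` are hypothetical placeholders and the median split is an assumption. The function names and the toy cohort are likewise invented for the example.

```python
from statistics import median

# HYPOTHETICAL coefficients: the abstract does not report the fitted Cox
# coefficients for the five LncRNAs, so these values are placeholders only.
COEFS = {
    "AC004551.1": 0.30,
    "MTOR-AS1": -0.25,
    "KCNAB1-AS2": 0.40,
    "FAM230G": 0.20,
    "LINC01283": 0.15,
}


def risk_score(expr):
    """Cox linear predictor: sum of coefficient * expression over the genes."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median.

    The median cut-off is an assumption; the abstract does not state one.
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(t, S(t))] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    curve, surv = [], 1.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


if __name__ == "__main__":
    # Toy cohort invented for illustration: (uniform expression, years, event).
    cohort = [(0.5, 9.0, 0), (1.0, 8.0, 1), (2.0, 4.0, 1), (3.0, 2.5, 1)]
    scores = [risk_score({g: x for g in COEFS}) for x, _, _ in cohort]
    groups = split_by_median(scores)
    for label in ("high", "low"):
        idx = [i for i, g in enumerate(groups) if g == label]
        km = kaplan_meier([cohort[i][1] for i in idx],
                          [cohort[i][2] for i in idx])
        print(label, km)
```

In practice the coefficients would come from the Cox model fitted on the training set and be applied unchanged to the validation set, so that the validation result is not influenced by refitting.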