Predicting anti-tumor drug sensitivity based on ensemble learning (cited by: 2)

Authors: HUANG Pengjie; LIN Yong[1]; ZHANG Menghuan; LÜ Lin; LIU Zhenhao; PEI Xiaoti; XU Linfeng; XIE Lu[2]

Affiliations: [1] School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; [2] Shanghai Center for Bioinformation Technology, Shanghai 201203, China; [3] Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China

Source: Chinese Journal of Medical Physics, 2021, No. 4, pp. 511-517 (7 pages)

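Funding: National Natural Science Foundation of China, Young Scientists Fund (31800700); National Natural Science Foundation of China (31301092); Shanghai Municipal Commission of Health and Family Planning, Collaborative Innovation Cluster Project (2019CXJQ02).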

Abstract: Predicting anti-tumor drug sensitivity is important for personalized, precise medication. Using the GDSC database, this study builds a prediction model for RNA-seq gene expression and anti-cancer drug sensitivity data through Boosting ensemble learning. The data sets for 183 drugs are each normalized and reduced in gene-feature dimensionality; models are then built with AdaBoost using SVM as the base learner and evaluated with 10-fold cross-validation. The resulting models achieve high prediction accuracy: the AUC exceeds 0.95 for 13 drugs, 0.90 for 108 drugs, and 0.80 for 174 drugs. In comparison experiments, AdaBoost+SVM improves the comprehensive evaluation index over the whole drug set by about 4% relative to single-learner models and by 2% relative to other ensemble models. The study also examines drug specificity, verifying drug action pathways through feature selection and enrichment analysis; this gives the model a biological interpretation and demonstrates its value for guiding clinical medication.
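As a rough illustration of the workflow the abstract describes (boosted ensemble, 10-fold cross-validation, AUC), the following is a minimal standard-library sketch, not the authors' implementation. It makes several assumptions: decision stumps stand in for the SVM base learner so that no external library is needed, the two-feature data are synthetic rather than GDSC expression profiles, labels are coded as +1 (sensitive) and -1 (resistant), and the AUC is computed on pooled out-of-fold scores. All function names are hypothetical.

```python
import math
import random

def auc(scores, labels):
    """AUC via the rank-sum identity: P(score of a positive > score of a negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_stump(X, y, w):
    """Weighted decision stump (stand-in for the SVM base learner, an assumption)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] > t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best  # (weighted error, feature index, threshold, sign)

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] > t else -sign

def fit_adaboost(X, y, rounds=10):
    """Discrete AdaBoost for labels in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # learner weight
        model.append((alpha, stump))
        # Up-weight misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def decision_score(model, row):
    return sum(alpha * stump_predict(stump, row) for alpha, stump in model)

def cross_validated_auc(X, y, k=10, seed=0):
    """k-fold CV; AUC is computed on the pooled out-of-fold scores."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = [0.0] * len(X)
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_adaboost([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            scores[i] = decision_score(model, X[i])
    return auc(scores, y)

# Synthetic stand-in data: feature 0 is informative, feature 1 is noise.
rng = random.Random(42)
X, y = [], []
for i in range(60):
    label = 1 if i % 2 == 0 else -1
    X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)])
    y.append(label)

cv_auc = cross_validated_auc(X, y)
print(f"10-fold cross-validated AUC: {cv_auc:.3f}")
```

In a per-drug setting such as the one the abstract describes, a loop of this kind would presumably be run once for each drug's data set, and the resulting AUC values compared against thresholds such as 0.80, 0.90, and 0.95.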

Keywords: ensemble learning; tumor; drug sensitivity prediction; AdaBoost; enrichment analysis

Classification: R318 [Medicine and Health: Biomedical Engineering]; R917 [Medicine and Health: Basic Medicine]

 
