Authors: 曹阿成 (CAO Acheng), 李晓琴 (LI Xiaoqin)[1], 高斌 (GAO Bin)[1]
Affiliation: [1] Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, China
Source: 《生物信息学》 (Chinese Journal of Bioinformatics), 2023, No. 1, pp. 37-44 (8 pages)
Funding: National Key R&D Program of China (No. 2017YFC0111104); National Natural Science Foundation of China (No. 61931013).
Abstract: Cancer is usually driven by the accumulation of genetic variants, and effectively identifying cancer driver mutations remains a major challenge. Existing methods mostly identify driver genes either by comparing the mutation rate observed in a genomic region with the rate expected from the background mutation rate (BMR) or by functional impact tests; the driver genes found this way are essentially statistically anomalous genes. Moreover, driver genes have not been compared between the subclasses of cancers that already have a well-defined classification. This paper introduces an association rule algorithm to search for effective rules of the form "a mutation in this gene leads a patient to develop this subclass of low-grade glioma". The algorithm relates mutation data to cancer outcome, and the resulting rules are screened and evaluated with three metrics (support, confidence, and lift) to predict candidate driver genes and between-class differences in driver genes. Using somatic mutation data from 491 low-grade glioma cases, we obtained 22 driver genes associated with the outcome, together with the subclass to which each belongs. Sensitivity and false-positive results were better than those of existing single algorithms, and all 22 genes have important biological functions. We also established a subclass identification method for low-grade glioma based on the 22 genes; the overall model accuracy was 98.99%, and the method can effectively distinguish the three subclasses.
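The three screening metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no thresholds or data format, so the sample layout (a set of mutated genes plus a subclass label per patient), the placeholder names `GENE_A`, `S1`, and so on, and the cutoff values are all assumptions chosen for illustration. For a rule "gene mutated → subclass", support is the fraction of all samples with both, confidence is the fraction of mutated samples that fall in the subclass, and lift is confidence divided by the subclass's overall frequency.

```python
def rule_metrics(samples, gene, subtype):
    """Support, confidence, and lift for the rule `gene mutated -> subtype`.

    `samples` is a list of (set_of_mutated_genes, subtype_label) pairs.
    This layout is an assumption made for the sketch.
    """
    n = len(samples)
    n_gene = sum(1 for genes, _ in samples if gene in genes)
    n_sub = sum(1 for _, label in samples if label == subtype)
    n_both = sum(1 for genes, label in samples
                 if gene in genes and label == subtype)
    support = n_both / n
    confidence = n_both / n_gene if n_gene else 0.0
    lift = confidence / (n_sub / n) if n_sub else 0.0
    return support, confidence, lift


def mine_rules(samples, min_support, min_confidence, min_lift):
    """Keep every gene -> subtype rule that clears all three thresholds."""
    genes = sorted(set().union(*(g for g, _ in samples)))
    subtypes = sorted({label for _, label in samples})
    kept = []
    for gene in genes:
        for subtype in subtypes:
            s, c, l = rule_metrics(samples, gene, subtype)
            if s >= min_support and c >= min_confidence and l >= min_lift:
                kept.append((gene, subtype, s, c, l))
    return kept


# Hypothetical toy cohort: placeholder gene names and subclass labels,
# not data from the paper.
samples = [
    ({"GENE_A"}, "S1"),
    ({"GENE_A", "GENE_B"}, "S1"),
    ({"GENE_A"}, "S1"),
    ({"GENE_B"}, "S2"),
    ({"GENE_B", "GENE_C"}, "S2"),
    ({"GENE_C"}, "S3"),
    ({"GENE_A", "GENE_C"}, "S3"),
    (set(), "S3"),
]

# Illustrative thresholds only; the abstract does not state the cutoffs used.
rules = mine_rules(samples, min_support=0.2, min_confidence=0.7, min_lift=1.0)
for gene, subtype, s, c, l in rules:
    print(f"{gene} -> {subtype}: support={s:.3f} confidence={c:.3f} lift={l:.3f}")
```

In this toy cohort only `GENE_A -> S1` survives the filter. A lift above 1 means the mutation occurs with the subclass more often than the subclass's base rate would predict, which is what lets a rule point to a subclass-specific candidate rather than a gene that is simply mutated often.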