Authors: ZHOU Qiang (周嫱); BAI Na (柏娜); LIU Shenggang (刘生刚) [2]; LIU Wei (刘伟) [3]; ZHANG Hongwei (张宏伟); LIU Hua (柳华) (Department of Neurology, The Affiliated Hospital of Southwest Jiaotong University & The Third People's Hospital of Chengdu, Chengdu 610000, China)
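Affiliations: [1] Department of Neurology, The Affiliated Hospital of Southwest Jiaotong University, The Third People's Hospital of Chengdu, Chengdu 610000; [2] Department of Neurology, Mianyang People's Hospital; [3] Department of Neurology, Nanbu County People's Hospital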
Source: Chinese Journal of Nervous and Mental Diseases (《中国神经精神疾病杂志》), 2022, No. 9, pp. 525-532 (8 pages)
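Funding: Key Research Project of the Sichuan Provincial Health Commission (No. 19ZD001); Technology Innovation R&D Project of the Chengdu Science and Technology Bureau (No. 2019-YF05-00014-SN); Popularization and Application Research Project of the Sichuan Provincial Health Commission (No. 19PJ010).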
Abstract (translated from the Chinese):
Objective: This study aimed to use bioinformatics and machine learning to screen and validate candidate key risk genes for ischemic stroke (IS), to explore the pathophysiological mechanisms associated with these genes, and to identify potential therapeutic targets for IS.
Methods: Two transcriptome datasets from human IS patients and healthy controls (GSE122709, GSE140275) were retrieved from the Gene Expression Omnibus (GEO). Differential expression analysis was performed on the mRNA in GSE122709, and the differentially expressed genes (DEGs) were then subjected to Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Disease Ontology (DO) enrichment analyses. Key genes were screened with two machine learning algorithms, least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE), and validated in GSE140275.
Results: A total of 378 DEGs were identified (176 upregulated and 202 downregulated). GO and KEGG enrichment analyses showed that the DEGs were mainly related to inflammatory response, immune regulation, COVID-19, and traditional IS risk factors. DO enrichment analysis showed that the DEGs were associated with gynecological tumor diseases. TVP23C and B3GAT1, the genes identified by both LASSO and SVM-RFE, were designated as specific risk genes for IS. Analysis of the validation dataset showed that TVP23C and B3GAT1 had significant diagnostic value for IS.
Conclusions: TVP23C and B3GAT1 may be key risk genes associated with IS. Together with the expression analysis of B3GAT1, the results suggest that B3GAT1 may participate in ischemic brain injury in IS by regulating AMPA glutamate receptors, providing a theoretical reference and scientific basis for the early diagnosis and treatment of IS.

Abstract (English):
Objective: This study aims to screen the feature genes of ischemic stroke (IS) by bioinformatics and machine learning (ML) and to explore the possible pathophysiological mechanisms of these genes in IS.
Methods: Two RNA sequencing datasets were downloaded from the NCBI Gene Expression Omnibus (GEO) database. The GSE122709 dataset, which has the larger sample size, was used as the training set and analyzed for differentially expressed genes (DEGs), while the GSE140275 dataset was used as the test set. The DEGs were further subjected to Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Disease Ontology (DO) enrichment analyses. Feature gene selection was then performed with two ML algorithms. The area under the receiver operating characteristic curve (AUC) was used to evaluate the performance of the ML algorithms.
Results: A total of 378 DEGs (fold change ≥ 2 and p value ≤ 0.05) were identified. The GO and KEGG analyses showed that the majority of DEGs were associated with inflammatory response, immune regulation, and COVID-19. The DO analysis revealed that the DEGs were mainly linked to demyelinating disease and cancer. TVP23C and B3GAT1 were identified as feature genes for IS by the ML algorithms, and their AUCs were close to 1 in both the training and test sets.
Conclusions: The integrated use of bioinformatics and ML could be a novel approach for screening feature genes for IS. B3GAT1 may mediate brain injury in IS by regulating AMPA glutamate receptors, which may be a possible therapeutic target in IS.
Keywords: ischemic stroke; bioinformatics; machine learning; GEO database; LASSO; SVM-RFE; hub genes
Classification No.: R743 [Medicine and Health: Neurology and Psychiatry]
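
Illustrative sketch (not the authors' code): the abstract names three steps that are easy to state concretely: thresholding DEGs on fold change and p value, keeping only the genes selected by both ML algorithms, and scoring a gene's diagnostic value by AUC. The minimal Python sketch below shows one way these steps could look. Everything in it is an assumption made for illustration: the gene names (`GENE_A` and so on), all numeric values, the function names `filter_degs` and `auc`, and the reading of "fold change ≥ 2" as a two-sided cutoff (at least 2-fold up or down). The LASSO and SVM-RFE fits themselves are not reproduced; their outputs are stood in for by two hand-written sets.

```python
from math import log2


def filter_degs(rows, min_fold_change=2.0, max_p=0.05):
    """Split genes into up- and down-regulated lists.

    rows: iterable of (gene, fold_change, p_value), where fold_change is the
    ratio of mean expression in cases to mean expression in controls (> 0).
    A gene is kept if p_value <= max_p and it changes by at least
    min_fold_change in either direction (assumed two-sided cutoff).
    """
    threshold = log2(min_fold_change)
    up, down = [], []
    for gene, fold_change, p_value in rows:
        if p_value > max_p:
            continue
        lfc = log2(fold_change)
        if lfc >= threshold:
            up.append(gene)
        elif lfc <= -threshold:
            down.append(gene)
    return up, down


def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case has
    the higher score; ties count as half a win."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical differential-expression table: (gene, fold change, p value).
rows = [
    ("GENE_A", 4.0, 0.001),   # up, significant
    ("GENE_B", 0.25, 0.01),   # down, significant
    ("GENE_C", 1.5, 0.001),   # significant, but change too small
    ("GENE_D", 8.0, 0.20),    # large change, not significant
    ("GENE_E", 2.0, 0.05),    # exactly on both thresholds, kept
]
up, down = filter_degs(rows)

# Stand-ins for the gene sets returned by the two selection algorithms.
lasso_genes = {"GENE_A", "GENE_B", "GENE_E"}
svm_rfe_genes = {"GENE_B", "GENE_E", "GENE_F"}
shared = sorted(lasso_genes & svm_rfe_genes)

# Hypothetical expression of one shared gene in cases and controls.
cases = [5.1, 6.3, 4.8, 7.0]
controls = [3.2, 4.8, 2.9, 4.0]

print(up, down)              # ['GENE_A', 'GENE_E'] ['GENE_B']
print(shared)                # ['GENE_B', 'GENE_E']
print(auc(cases, controls))  # 0.96875
```

The pairwise AUC used here is equivalent to the area under the ROC curve for a single score. For a gene that is lower in cases than in controls, the value falls below 0.5, and `1 - auc(...)` gives the discrimination achieved when the direction is flipped.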