Authors: LI Xin (李昕); JIA Tao (贾韬)
Affiliation: [1] College of Computer and Information Science, Southwest University, Chongqing 400715, China
Source: Journal of Computer Applications (《计算机应用》), 2022, No. 11, pp. 3404-3412 (9 pages)
Funding: China University Industry-University-Research Innovation Fund of the Ministry of Education (2021ALA03016).
Abstract: To address the problems that Cell type-Specificity (CS) information and the similarity and difference information between cell types are not properly used when predicting Differential Gene Expression (DGE) from large-scale Histone Modification (HM) data, and that the input volume and computational cost are high, a deep learning-based method named dcsDiff was proposed. First, multiple AutoEncoders (AEs) and Bi-directional Long Short-Term Memory (Bi-LSTM) networks were introduced to reduce the dimensionality of the HM signals and model them to obtain embedded representations. Then, multiple Convolutional Neural Networks (CNNs) were used to mine the combined effects of HMs within each cell type, as well as the similarity and difference information of each HM and the joint effects of all HMs between the two cell types. Finally, the two kinds of information were fused to predict DGE between the two cell types. In comparison experiments with DeepDiff on 10 pairs of cell types from the REMC (Roadmap Epigenomics Mapping Consortium) database, the Pearson Correlation Coefficient (PCC) of dcsDiff in DGE prediction increased by up to 7.2% and by 3.9% on average, the number of differentially expressed genes accurately detected increased by up to 36 and by 17.6 on average, and the running time was reduced by 78.7%. A component analysis experiment further demonstrated the effectiveness of properly integrating the two kinds of information, and the parameters of dcsDiff were also determined experimentally. The experimental results show that dcsDiff can effectively improve the efficiency of DGE prediction.
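Keywords: histone modification; differential gene expression; cell type-specificity; autoencoder; bi-directional long short-term memory network; information fusion; epigenetics
Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]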