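Authors: HU Hehuan (胡鹤还); MENG Jun (孟军)[1]; ZHAO Siyuan (赵思远); JI Tengqi (纪腾其) (School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China)
Affiliation: [1] School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, Liaoning, China
Source: Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》), 2022, No. 1, pp. 12-18 (7 pages)
Funding: National Natural Science Foundation of China (61872055).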
Abstract: Long non-coding RNAs (lncRNAs) are non-coding RNAs longer than 200 nt that do not encode proteins. Recent studies, however, show that some lncRNAs contain short open reading frames (sORFs) of no more than 300 nt and can encode small peptides. This finding has drawn attention to sORF-encoded peptides (SEPs) as a new research field. SEPs have so far been studied mostly through biological experiments and traditional machine learning. Because biological experiments are costly and time-consuming, and traditional machine learning requires extensive manual intervention, a deep learning model based on a multi-scale convolutional capsule network is proposed. The model extracts sequence features thoroughly and clusters them through the connections between capsules. Evaluated by five-fold cross-validation on a Physcomitrella patens (moss) dataset, the model classifies better than a single deep learning model and a simple fusion of deep learning models. Independent tests on datasets from Arabidopsis thaliana and Glycine max (soybean) further confirm that the model generalizes well.
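Keywords: capsule network; long non-coding RNA; short open reading frame; small peptide; prediction
Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]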
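
Illustration: the abstract defines sORFs as open reading frames of no more than 300 nt. The sketch below shows one way such candidates might be located in a sequence. It is not the paper's method: the function name `find_sorfs`, the forward-strand-only scan, the use of ATG as the only start codon, the standard stop codons, and counting the stop codon toward the length are all assumptions made for illustration.

```python
# Hypothetical helper, not taken from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)
START_CODON = "ATG"                  # only canonical start considered (assumed)
MAX_SORF_NT = 300                    # upper bound stated in the abstract


def find_sorfs(seq, max_len=MAX_SORF_NT):
    """Return sorted (start, end, orf) tuples for forward-strand ORFs
    whose length, stop codon included, is at most max_len nucleotides."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start <= max_len:
                    hits.append((start, end, seq[start:end]))
                start = None  # resume scanning after this stop codon
    return sorted(hits)


if __name__ == "__main__":
    for start, end, orf in find_sorfs("CCATGAAATAGGG"):
        print(start, end, orf)
```

Coordinates are 0-based with an exclusive end. ORFs that run off the end of the sequence without a stop codon are ignored, and no minimum length is enforced, since the abstract gives none.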