Authors: JIA Jian-Hua (贾建华); CHEN Tian (陈天); WU Gen-Qiang (吴跟强); SUN Ming-Wei (孙明炜)
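Affiliation: [1] Bioinformatics Research Laboratory, School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen 333403, Jiangxi, China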
Source: Chinese Journal of Biochemistry and Molecular Biology (《中国生物化学与分子生物学报》), 2023, No. 6, pp. 889-895 (7 pages)
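Funding: Supported by the National Natural Science Foundation of China (No. 61761023, 31760315); the Natural Science Foundation of Jiangxi Province (No. 20202BABL202004, 20202BAB202007); and the Scientific Research Program of the Jiangxi Provincial Department of Education (No. GJJ190695, No. GJJ212419).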
Abstract: N6,2′-O-dimethyladenosine (m6Am) is a common reversible modification of RNA molecules. Some studies have described the effect of m6Am on mRNA, but its biological function remains insufficiently explored. We therefore propose m6AmTwins, a new end-to-end Twins network that combines a Transformer (autoencoder) with a bidirectional gated recurrent unit (Bi-GRU) to obtain RNA detectability directly from RNA sequences. Compared with existing algorithms, the main contribution is the use of contrastive learning to construct a new loss function for training m6AmTwins, which improves the model's generalization ability. Using the Twins network and a simple encoding scheme on two imbalanced datasets with a positive-to-negative ratio of 1:10, the model achieved Matthews correlation coefficients (MCC) of 0.53 and 0.545 on the respective independent test sets. To check the robustness of m6AmTwins, 10-fold cross-validation was also performed on the training sets, giving MCC values of 0.562 and 0.567 for the two datasets. These results indicate that the model generalizes well and may be of value for biomedical research on m6Am.
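Note on the metric: the abstract reports MCC rather than accuracy on data with a 1:10 positive-to-negative ratio. The sketch below is not from the paper; it only illustrates the standard MCC formula computed from confusion-matrix counts. The function name `mcc` and the example counts (100 positives, 1000 negatives) are hypothetical, chosen to mirror the 1:10 ratio.

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the paper does not state how it handles this case).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Hypothetical 1:10 dataset: 100 positive and 1000 negative samples.
# A classifier that labels everything negative is about 90.9% accurate,
# yet its MCC is 0 because it never finds a positive sample.
all_negative_accuracy = 1000 / (100 + 1000)
all_negative_mcc = mcc(tp=0, tn=1000, fp=0, fn=100)

# A perfect classifier on the same data reaches the maximum MCC of 1.
perfect_mcc = mcc(tp=100, tn=1000, fp=0, fn=0)
```

Because MCC uses all four cells of the confusion matrix, it does not reward a model for simply predicting the majority class, which is why it suits imbalanced data such as this.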
Keywords: N6,2′-O-dimethyladenosine; feature extraction; deep learning; Twins network
Classification No.: R318.04 [Medicine and Health - Biomedical Engineering]