Identifying Alternative Splicing Sites Based on Deep Convolutional Neural Networks


Authors: Zhang Yuchen[1]; Yang Litao[1]; Wang Canhua[1] (School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240)

Affiliation: [1] School of Life Sciences and Biotechnology, Shanghai Jiao Tong University

Source: Genomics and Applied Biology, 2019, Issue 11, pp. 4986-4991 (6 pages)

Funding: Supported by the Major Special Project for Breeding New Varieties of Genetically Modified Crops (2016ZX08012-003)

Abstract: Alternative splicing arises from the regulatory process by which a multi-exon gene produces multiple transcripts. With advances in high-throughput sequencing, especially RNA-seq, splice junctions and splice sites can be predicted by mining massive sequencing datasets. Alternative splicing has broadened our knowledge of gene structure and protein isoforms. However, existing short-read aligners are affected by random alignments, which produce many false-positive splice sites and interfere with downstream analysis. This study finds that the structural features of the sequences flanking alternative splice sites can be extracted by a deep learning model, and uses a deep convolutional neural network to identify splice sites. The model offers a high recognition rate, fast computation, strong generalization, and high robustness.
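The abstract does not describe the network architecture, so the following is only a minimal sketch of the basic operation a convolutional model applies to a sequence window around a candidate splice site: one-hot encoding followed by a 1D convolution with ReLU. The names `one_hot`, `conv1d`, and `gt_filter`, the example window, and the hand-set filter weights are all hypothetical. In a trained network the filters would be learned from data rather than fixed to the canonical GT donor dinucleotide used here for illustration.

```python
from typing import List

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Encode a DNA string as a len(seq) x 4 matrix (columns A, C, G, T).

    Bases outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d(x: List[List[float]], kernel: List[List[float]], bias: float = 0.0) -> List[float]:
    """Slide one filter along the sequence (no padding) and apply ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(x) - k + 1):
        s = bias
        for j in range(k):
            for c in range(len(BASES)):
                s += x[i + j][c] * kernel[j][c]
        out.append(max(0.0, s))  # ReLU keeps only positive responses
    return out


# Hypothetical hand-set filter (an assumption, not the paper's learned weights):
# it fires only where the window contains the dinucleotide "GT".
gt_filter = one_hot("GT")

# Hypothetical sequence window around a candidate donor site.
window = "CAGGTAAGT"
activations = conv1d(one_hot(window), gt_filter, bias=-1.0)

# Global max pooling summarizes whether the motif occurs anywhere in the window.
pooled = max(activations)
print(activations, pooled)
```

A classifier of this kind would typically stack several such learned filters and feed the pooled features to a final layer that scores a candidate site as true or false positive. That part is omitted here because the abstract gives no details about it.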

Keywords: deep learning; alternative splicing; convolutional neural network; RNA high-throughput sequencing (RNA-seq)

Classification: Q811.4 [Biology: Bioengineering]; TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
