Application of hidden Markov model in the recognition of splicing sites

Cited by: 9

Authors: XIA Huiyu (夏慧煜)[1], ZHOU Qing (周晴)[1], LI Yanda (李衍达)[1]

Affiliation: [1] State Key Laboratory of Intelligent Technology and Systems, Department of Automation, Tsinghua University, Beijing 100084

Source: Journal of Tsinghua University (Science and Technology), 2002, No. 9, pp. 1214-1217 (4 pages)

Funding: National Natural Science Foundation of China (Grant No. 69935020)

Abstract: Recognizing splicing sites is an important step in gene recognition. Because existing gene recognition algorithms focus mainly on the global features of coding regions rather than on the information carried by individual sites, they have difficulty identifying splicing sites accurately. On the assumption that adjacent bases in the conserved sequences around splicing sites are correlated, a first-order Markov chain was used to model this correlation, and on this basis a hidden Markov model (HMM) method designed specifically for splicing site recognition was developed. Experimental results show that an HMM describes the sequences around splicing sites realistically, and that the method captures well the statistical features of the conserved sequences in both their marginal distributions and their conditional distributions (transition probabilities). When the method was applied to true and false splicing sites, the recognition rate for each exceeded 90%.
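
The idea of modelling the dependence between adjacent bases with a first-order Markov chain can be illustrated with a minimal Python sketch. It is not the HMM method of the paper, whose details are not given in this record. It assumes a simplified setup: one homogeneous first-order chain is fitted to windows around true sites and another to windows around false sites, and a candidate is classified by the log-likelihood ratio. The function names (`train_markov`, `log_likelihood`, `is_true_site`), the additive smoothing, the zero threshold, and the toy sequences are all illustrative assumptions, not data or code from the article.

```python
import math

BASES = "ACGT"

def train_markov(seqs, pseudo=1.0):
    """Fit a homogeneous first-order Markov chain to DNA windows.

    Returns (init, trans): the distribution of the first base (marginal)
    and the transition probabilities P(next base | current base).
    Additive smoothing (pseudo) avoids zero probabilities.
    """
    init = {b: pseudo for b in BASES}
    trans = {a: {b: pseudo for b in BASES} for a in BASES}
    for s in seqs:
        init[s[0]] += 1
        for a, b in zip(s, s[1:]):
            trans[a][b] += 1
    total = sum(init.values())
    init = {b: c / total for b, c in init.items()}
    for a in BASES:
        row = sum(trans[a].values())
        trans[a] = {b: c / row for b, c in trans[a].items()}
    return init, trans

def log_likelihood(seq, model):
    """Log-probability of seq under a first-order Markov chain."""
    init, trans = model
    ll = math.log(init[seq[0]])
    for a, b in zip(seq, seq[1:]):
        ll += math.log(trans[a][b])
    return ll

def is_true_site(seq, true_model, false_model, threshold=0.0):
    """Classify a window by the log-likelihood ratio of the two chains."""
    ratio = log_likelihood(seq, true_model) - log_likelihood(seq, false_model)
    return ratio > threshold

# Toy windows, invented for illustration only (not data from the article).
true_windows = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT"]
false_windows = ["CTCGTTCCA", "TTAGTCCGA", "ACCGTTTGC", "GGTGTCATC"]

true_model = train_markov(true_windows)
false_model = train_markov(false_windows)

print(is_true_site("CAGGTAAGT", true_model, false_model))
print(is_true_site("CTCGTTCCA", true_model, false_model))
```

In practice the threshold would be tuned on held-out data, and a position-dependent chain or a full HMM would capture more of the structure around the site than this homogeneous chain does.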

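Keywords: hidden Markov model; splicing site; gene recognition; gene coding region; sequence length; conserved sequence; probabilistic and statistical methods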

Classification: Q523.8 [Biology: Biochemistry]; Q-332

 
