Predicting Human Pol II Promoter Based on Sequence Features

Authors: Yang Keli [1], Xu Qiang [1]

Affiliation: [1] Department of Physics, Baoji University of Arts and Sciences, Baoji 721007, Shaanxi, China

Source: Life Science Research, 2009, No. 5, pp. 403-407 (5 pages)

Funding: Baoji University of Arts and Sciences Master's Start-up Project (08113)

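Abstract: Based on known human Pol II promoter sequence data, sequence-content and sequence-signal features of promoters were combined to build a support vector machine (SVM) promoter classifier. The 6-mer frequencies of the promoter sequences were used as increment-of-diversity source parameters to construct the sequence-content features, and the 3-mer frequencies at 24 positions were used as sequence-signal features to construct position weight matrices (PWMs). The two resulting sets of parameters were input into the SVM to predict human promoters. Prediction performance was assessed by 10-fold cross-validation and on an independent dataset, and the correlation coefficient exceeded 95%. The results show that combining the increment-of-diversity algorithm with an SVM effectively improves the prediction success rate and is an effective approach to eukaryotic promoter prediction.

English abstract: Content vectors and signal vectors were extracted from the promoter sequences using six minimum increment of diversity values, three kinds of position weight matrix, and the GC percentage of the sequences. The computed vectors were input into a support vector machine (SVM) algorithm to build a promoter classification model. Human Pol II promoter sequences were predicted with the SVM, and the model was validated by 10-fold cross-validation and on independent test data. The overall prediction accuracy (sensitivity) and the specificity both exceeded 95%. These results indicate that combining the increment of diversity with support vector machines is an effective method for predicting eukaryotic promoter sequences.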
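The abstract names the increment of diversity over 6-mer counts as the content feature but does not give a formula. The sketch below assumes the commonly cited definition: for a count vector with total N, the diversity is D = N ln N - Σ n_i ln n_i, and ID(X, Y) = D(X + Y) - D(X) - D(Y). The function names and toy sequences are hypothetical and are not taken from the paper; k = 3 is used only to keep the example small.

```python
from collections import Counter
from math import log


def kmer_counts(seq, k):
    """Count overlapping k-mers; windows containing non-ACGT symbols are skipped."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def diversity(counts):
    """D = N ln N - sum(n_i ln n_i), with 0 ln 0 taken as 0 (assumed definition)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * log(total) - sum(n * log(n) for n in counts.values() if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); small when the two count profiles are similar."""
    merged = Counter(x)
    merged.update(y)  # element-wise sum of the two count vectors
    return diversity(merged) - diversity(x) - diversity(y)


# Hypothetical toy sequences, for illustration only.
reference = kmer_counts("TATAAAAGGCGCGCTATAAAAGG", 3)
query_similar = kmer_counts("GGTATAAAAGGCGC", 3)
query_different = kmer_counts("ACACACACACACAC", 3)

id_similar = increment_of_diversity(query_similar, reference)
id_different = increment_of_diversity(query_different, reference)
```

In a pipeline like the one the abstract describes, ID values computed against promoter and non-promoter reference sources would presumably serve as input features for the SVM; that wiring is not shown here because the abstract does not specify it.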

Keywords: promoter; increment of diversity; position weight matrix (PWM); support vector machine (SVM)

Classification number: Q61 [Biology / Biophysics]

 
