Applying novel knowledge-based encoding methods and dual SVM to human Pol II promoter recognition  (Cited by: 2)

Authors: Zhi Hui (智慧) [1], Li Tonghua (李通化) [2]

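Affiliations: [1] Department of Bioinformatics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, Heilongjiang, China; [2] Department of Chemistry, Tongji University, Shanghai 200092, China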

Source: Journal of Harbin Medical University (《哈尔滨医科大学学报》), 2012, Issue 1, pp. 23-26 (4 pages)

Abstract: Objective: To select a method for classifying human RNA polymerase (Pol) II promoter data that improves recognition accuracy. Methods: Human promoter sequences were encoded with knowledge-based methods, namely statistical encoding, CpG encoding, pentamer encoding, and Pattern Dictionary encoding. A consensus model was then built, and a support vector machine (SVM) was used to classify the promoter data. Results: After encoding, recognition with the SVM gave higher accuracy, specificity, and sensitivity than other SVM-based tools. When the new encodings were applied to promoter data from human chromosome 22, the Pattern Dictionary encoding reached a recognition accuracy of 90.98%. Conclusion: The consensus model takes into account the independence of each sub-model and the differences between the models, and exploits their complementary strengths, thereby improving the final recognition accuracy.
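The abstract names the encodings and the reported metrics but does not define them. The following is a minimal sketch of how two of the encodings and the three metrics could be computed, under stated assumptions: pentamer encoding is taken to be the frequency of each overlapping 5-mer, and CpG encoding is taken to be GC content plus the conventional CpG observed/expected ratio. The function names (`pentamer_frequencies`, `cpg_features`, `confusion_metrics`) are hypothetical and are not taken from the paper. The statistical and Pattern Dictionary encodings, the SVM itself, and the consensus layer are not attempted here.

```python
from itertools import product

ALPHABET = "ACGT"
# All 4**5 = 1024 possible pentamers, in a fixed lexicographic order.
PENTAMERS = ["".join(p) for p in product(ALPHABET, repeat=5)]
PENTAMER_INDEX = {p: i for i, p in enumerate(PENTAMERS)}


def pentamer_frequencies(seq):
    """Return a 1024-dimensional vector of overlapping pentamer frequencies.

    Assumption: the feature is a relative frequency; the paper may use
    a different weighting.
    """
    seq = seq.upper()
    counts = [0] * len(PENTAMERS)
    total = 0
    for i in range(len(seq) - 4):
        idx = PENTAMER_INDEX.get(seq[i:i + 5])
        if idx is not None:  # skip windows containing N or other symbols
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PENTAMERS)
    return [c / total for c in counts]


def cpg_features(seq):
    """Return (GC content, CpG observed/expected ratio) for a sequence.

    Assumption: observed/expected = (#CpG * N) / (#C * #G), where N is
    the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def confusion_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity).

    Labels: 1 = promoter, 0 = non-promoter.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity
```

In a pipeline of the kind the abstract describes, vectors such as these would be passed to an SVM for each encoding, with a second-layer model combining the sub-model outputs. That part is left out because the abstract gives no kernel, parameters, or combination rule.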

Keywords: promoter recognition; support vector machine; consensus model; dual-layer SVM; biostatistics

Classification code: Q81 [Biology: Bioengineering]

 
