Prediction of the Subcellular Location of Apoptosis-Related Proteins with Encoding Based on Grouped Weight for Protein Sequence  (Cited by: 5)

Authors: Zhang Zhenhui (张振慧)[1], Wang Zhenghua (王正华)[2], Wang Yongxian (王勇献)[2]

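Affiliations: [1] Department of Mathematics and System Science, College of Science, National University of Defense Technology, Changsha 410073; [2] National Key Laboratory for Parallel and Distributed Processing, School of Computer Science, National University of Defense Technology, Changsha 410073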

Source: Acta Biophysica Sinica (《生物物理学报》), 2006, No. 4, pp. 275-282 (8 pages)

Abstract: Apoptosis-related proteins play a central role in the development and homeostasis of an organism, and they are important for understanding the mechanism of programmed cell death. Starting from the physicochemical properties of amino acids and drawing on the ideas of coarse-grained description and grouping from physics, we propose a new feature extraction method for protein sequences: grouped weight encoding. With the component-coupled algorithm as the classifier, we predict the subcellular location of apoptosis proteins from their primary sequences. On the dataset used by Zhou and Doctor (98 proteins), the overall prediction accuracy was 98.0% in the Re-substitution test and 85.7% in the Jackknife test, which is 7.2 and 13.2 percentage points higher, respectively, than the accuracy obtained with amino-acid composition and the same component-coupled algorithm. On the dataset used by Chen Yingli and Li Qianzhong (151 proteins), the overall prediction accuracy was 94.0% in the Re-substitution test and 80.1% in the Jackknife test, which is 5.9 and 2.0 percentage points higher, respectively, than the method based on dipeptide composition and the increment of diversity algorithm. On a new dataset that we compiled ourselves, the overall prediction accuracy was 97.33% in the Re-substitution test and 75.11% in the Jackknife test. These results indicate that grouped weight encoding is an effective way to extract the structural information implicit in protein sequences for apoptosis protein localization, and that the method achieves satisfactory performance despite its simplicity.
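The abstract reports accuracy under two validation schemes, Re-substitution and Jackknife. The sketch below illustrates only how those two tests differ. It is not the authors' method: the nearest-centroid classifier is a stand-in for the component-coupled algorithm, the random three-dimensional vectors are a stand-in for grouped weight encoding features, and all function names are hypothetical.

```python
import math
import random


def centroid(rows):
    """Mean feature vector of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def predict(train_x, train_y, x):
    """Stand-in classifier: pick the class with the nearest centroid."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(train_y)):
        members = [r for r, lab in zip(train_x, train_y) if lab == label]
        d = math.dist(centroid(members), x)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def resubstitution_accuracy(xs, ys):
    """Train on every sample, then test on the same samples."""
    hits = sum(predict(xs, ys, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


def jackknife_accuracy(xs, ys):
    """Hold out each sample in turn, train on the rest, test on the held-out one."""
    hits = 0
    for i in range(len(ys)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(ys)


# Toy data (an assumption for illustration, not the paper's protein datasets).
random.seed(0)
xs = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]
xs += [[random.gauss(1.5, 1.0) for _ in range(3)] for _ in range(20)]
ys = [0] * 20 + [1] * 20

resub = resubstitution_accuracy(xs, ys)
jack = jackknife_accuracy(xs, ys)
print(f"re-substitution: {resub:.3f}, jackknife: {jack:.3f}")
```

Because the Jackknife test never scores a sample with a model that has seen it, it is usually the more conservative figure, which is consistent with the gap between the two accuracies reported in the abstract.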

Keywords: grouped weight encoding; apoptosis proteins; component-coupled algorithm

Classification: Q61 [Biology: Biophysics]

 
