Protein-metal-ion interaction site prediction based on class imbalance learning  Cited by: 1


Authors: Qiao Liang (乔梁); Xie Dongqing (谢冬青) (School of Mathematics and Information Science, Guangzhou University, Guangzhou 510006, China)

Affiliation: [1] School of Mathematics and Information Science, Guangzhou University, Guangzhou, Guangdong 510006

Source: Journal of Nanjing University of Science and Technology (《南京理工大学学报》), 2018, Issue 6, pp. 707-715 (9 pages)

Funding: National Natural Science Foundation of China (61772007)

Abstract: To improve the accuracy of protein-metal-ion interaction site (PMIIS) prediction, a class imbalance learning algorithm that combines under-sampling and over-sampling is proposed to address the imbalanced data distribution. The majority and minority classes are sampled at the same time, so that the information in the minority samples is supplemented while the redundant information in the majority samples is reduced. Based on this class imbalance learning algorithm and the support vector machine (SVM), a sequence-based prediction method is designed. To evaluate PMIIS prediction performance objectively, a relatively complete standard dataset containing protein-Zn^2+, protein-Ca^2+ and protein-Fe^3+ interaction sites is constructed. Experimental results on this dataset show that the proposed method achieves an average Matthews correlation coefficient (MCC) of 0.646 on protein-Zn^2+, protein-Ca^2+ and protein-Fe^3+ interaction site prediction, which is better than TargetS and IonCom.
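The abstract names two ingredients that a short example can make concrete: sampling both classes at once, and scoring with MCC. The sketch below is only an illustration under stated assumptions. The abstract does not say which under-sampling or over-sampling procedures the authors use, so plain random sampling stands in for both; the function names `resample_both` and `mcc` and the `target` parameter are hypothetical. The SVM step is omitted; a classifier would be trained on the two resampled lists.

```python
import math
import random


def resample_both(majority, minority, target, seed=0):
    """Resample both classes toward `target` samples each.

    Assumption: random under-sampling and random duplication are stand-ins;
    the paper's actual sampling procedures are not given in the abstract.
    """
    rng = random.Random(seed)
    # Under-sampling: keep a random subset of the majority class (no replacement).
    kept_majority = rng.sample(majority, min(target, len(majority)))
    # Over-sampling: keep every minority sample, then add random duplicates
    # until `target` is reached (no samples are dropped if already above it).
    extra = max(0, target - len(minority))
    grown_minority = list(minority) + [rng.choice(minority) for _ in range(extra)]
    return kept_majority, grown_minority


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data: 1000 non-binding residues versus 40 binding residues.
    majority = [("non-site", i) for i in range(1000)]
    minority = [("site", i) for i in range(40)]
    neg, pos = resample_both(majority, minority, target=200)
    print(len(neg), len(pos))
    print(mcc(tp=30, tn=940, fp=20, fn=10))
```

MCC uses all four confusion-matrix counts, which is why it is a common choice when binding residues are far rarer than non-binding ones and plain accuracy would be dominated by the majority class.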

Keywords: class imbalance learning; protein and metal ions; interaction sites; prediction; support vector machine

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]
