Machine learning methods for predicting the activity of hormone-sensitive lipase inhibitors  Cited by: 3

Classification for hormone-sensitive lipase inhibitors based on machine learning methods


Authors: Chen Bin (陈彬)[1], Rao Hanbing (饶含兵)[1], He Hua (何桦)[2], Yang Guobing (杨国兵)[3], Li Zerong (李泽荣)[4]

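Affiliations: [1] College of Life Science and Science, Sichuan Agricultural University, Ya'an, Sichuan 625014; [2] Institute of Animal Genetics and Breeding, Sichuan Agricultural University, Ya'an, Sichuan 625014; [3] College of Chemical Engineering, Sichuan University, Chengdu, Sichuan 610065; [4] College of Chemistry, Sichuan University, Chengdu, Sichuan 610064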

Source: Chemical Research and Application (《化学研究与应用》), 2011, No. 12, pp. 1577-1584 (8 pages)

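Funding: Supported by the Sichuan Agricultural University Undergraduate Thesis Cultivation Fund (No. 00709062) and the Sichuan Agricultural University Shuangzhi Program (双支计划, No. 00770117)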

Abstract: A total of 1559 molecular descriptors, covering constitutional, charge distribution, topological, geometrical, and physicochemical properties, were calculated to encode hormone-sensitive lipase inhibitors. A hybrid filter/wrapper approach combining Fisher score ranking with Monte Carlo simulated annealing reduced these to 35 descriptors. Classification models for hormone-sensitive lipase inhibitors were then built with support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor (k-NN), continuous kernel discrimination (CKD), and logistic regression (LR) methods. For the 200 samples in the training set, 5-fold cross-validation gave average prediction accuracies across the methods of 78.0%-94.0%, 69.0%-91.0%, and 73.5%-92.5% for positive, negative, and all samples, respectively. A y-scrambling test of the SVM model gave average prediction accuracies of 60.0%-74.0%, 58.0%-71.0%, and 61.0%-69.5% for positive, negative, and all samples, respectively, clearly lower than those of the actual model, indicating that the model is not the result of chance correlation. For an external test set of 52 samples that were not used in model building, the methods gave prediction accuracies of 84.6%-92.3%, 88.5%-92.3%, and 86.5%-92.3% for positive, negative, and all samples, respectively. Among the models, SVM, CKD, and LR performed best; the accuracies obtained in this study, especially with SVM, were markedly higher than previously reported literature results.
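
The filter step and the per-class accuracies mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common two-class form of the Fisher score, (mean_pos - mean_neg)^2 / (var_pos + var_neg); the abstract does not give the exact formula used, so this form is an assumption. The function names (`fisher_score`, `rank_descriptors`, `class_accuracies`) and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of one descriptor (assumed form, see lead-in)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # No within-class variation: perfectly separating if the means differ.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_descriptors(X, labels):
    """Return (column indices sorted by descending score, per-column scores)."""
    n_cols = len(X[0])
    scores = [fisher_score([row[j] for row in X], labels) for j in range(n_cols)]
    order = sorted(range(n_cols), key=lambda j: scores[j], reverse=True)
    return order, scores


def class_accuracies(y_true, y_pred):
    """Accuracy on positive samples, on negative samples, and on all samples."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    acc_pos = sum(p == 1 for p in pos) / len(pos)
    acc_neg = sum(p == 0 for p in neg) / len(neg)
    acc_all = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return acc_pos, acc_neg, acc_all


# Hypothetical toy data: column 0 is uninformative, column 1 separates the classes.
X = [
    [5.0, 1.0], [3.0, 1.2], [4.0, 0.9],   # positive samples (label 1)
    [5.0, 3.0], [3.0, 3.1], [4.0, 2.9],   # negative samples (label 0)
]
labels = [1, 1, 1, 0, 0, 0]

order, scores = rank_descriptors(X, labels)
print("descriptor ranking:", order)
print("per-class accuracies:", class_accuracies([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a workflow like the one described, the top-ranked descriptors from such a filter would be passed on to a wrapper search (Monte Carlo simulated annealing in the paper), which is not sketched here.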

Keywords: hormone-sensitive lipase inhibitors; machine learning methods; variable selection

Classification code: O64 [Science: Physical Chemistry]
