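Authors: Chen Bin (陈彬)[1], Rao Hanbing (饶含兵)[1], He Hua (何桦)[2], Yang Guobing (杨国兵)[3], Li Zerong (李泽荣)[4]
Affiliations: [1] College of Life Sciences and Science, Sichuan Agricultural University, Ya'an, Sichuan 625014; [2] Institute of Animal Genetics and Breeding, Sichuan Agricultural University, Ya'an, Sichuan 625014; [3] College of Chemical Engineering, Sichuan University, Chengdu, Sichuan 610065; [4] College of Chemistry, Sichuan University, Chengdu, Sichuan 610064
Source: Chemical Research and Application (《化学研究与应用》), 2011, No. 12, pp. 1577-1584 (8 pages)
Funding: Supported by the Sichuan Agricultural University Undergraduate Thesis Cultivation Fund (No. 00709062) and the Sichuan Agricultural University "Double Support" Program (双支计划, No. 00770117)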
Abstract: For hormone-sensitive lipase inhibitors, a total of 1559 molecular descriptors were calculated to encode constitutional, charge-distribution, topological, geometrical, and physicochemical properties. A hybrid filter/wrapper approach combining Fischer Score ranking with Monte Carlo simulated annealing reduced these to 35 descriptors. Classification models for hormone-sensitive lipase inhibitors were then built with support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor (k-NN), continuous kernel discrimination (CKD), and logistic regression (LR) methods. For the 200 samples in the training set, 5-fold cross-validation gave average prediction accuracies across the methods of 78.0%-94.0%, 69.0%-91.0%, and 73.5%-92.5% for positive, negative, and total samples, respectively. A y-scrambling test of the SVM model for chance correlation gave average prediction accuracies of 60.0%-74.0%, 58.0%-71.0%, and 61.0%-69.5% for positive, negative, and total samples, respectively. These are clearly lower than the accuracies of the actual model, indicating that the model is not a chance correlation. For an external test set of 52 samples not used in model building, the methods gave prediction accuracies of 84.6%-92.3%, 88.5%-92.3%, and 86.5%-92.3% for positive, negative, and total samples, respectively. Among the models built, SVM, CKD, and LR performed best, with prediction accuracies markedly higher than results reported in the literature.
Keywords: hormone-sensitive lipase inhibitors; machine learning methods; variable selection
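
The validation scheme described in the abstract (5-fold cross-validation reported separately for positive, negative, and total samples, followed by a y-scrambling check) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic Gaussian clusters rather than the paper's descriptors, a plain k-NN classifier stands in for the paper's models, and the function names `make_data`, `knn_predict`, `cross_validate`, and `y_scramble` are hypothetical.

```python
import random

def make_data(n_per_class=100, n_features=5, seed=0):
    """Synthetic stand-in for descriptor data (assumption): two Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0

def cross_validate(X, y, n_folds=5, seed=0):
    """Return (positive, negative, total) accuracy pooled over all held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    hits = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in fold:
            counts[y[i]] += 1
            hits[y[i]] += int(knn_predict(train_X, train_y, X[i]) == y[i])
    pos = hits[1] / counts[1]
    neg = hits[0] / counts[0]
    total = (hits[0] + hits[1]) / (counts[0] + counts[1])
    return pos, neg, total

def y_scramble(X, y, n_rounds=10, seed=1):
    """Repeat cross-validation with randomly permuted labels; return total accuracies."""
    rng = random.Random(seed)
    totals = []
    for r in range(n_rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)  # breaks any real descriptor-label relationship
        totals.append(cross_validate(X, shuffled, seed=r)[2])
    return totals

if __name__ == "__main__":
    X, y = make_data()
    print("real labels (pos, neg, total):", cross_validate(X, y))
    print("scrambled labels (total):", y_scramble(X, y))
```

If the accuracy on the real labels is well above every accuracy obtained with permuted labels, the model is unlikely to reflect a chance correlation, which is the argument the abstract makes for its SVM model.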