QSPR study on the properties of organic compounds based on sample classification by KNN and K-means clustering: the sensitization and the polarity parameters of some organic compounds  Cited by: 2


Authors: ZHANG Yaxiong[1], WANG Jing[1]

Affiliation: [1] School of Chemistry and Materials Science, Shanxi Normal University, Linfen, Shanxi 041000, China

Source: Computers and Applied Chemistry (《计算机与应用化学》), 2016, No. 12, pp. 1295-1300 (6 pages)

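Funding: Shanxi Province Program for Returned Overseas Scholars (2014-045); Natural Science Foundation of Shanxi Province (2010011013-2); Shanxi Normal University Teaching Reform Project (SD2013JGXM-54)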

Abstract: Two data sets of organic compounds were studied: sensitization data for one group of compounds and polarity parameters for another. For both sets, suitable structural descriptors were calculated and selected with the ADMEWORKS ModelBuilder software. The samples in each set were then classified with the K-nearest neighbor (KNN) and K-means clustering methods, and QSPR models were built on each classified subset using multiple linear regression (MLR), partial least squares (PLS), and artificial neural networks (ANN). The results show that either classification method improves the prediction results of the QSPR models to some extent. For both data sets, the subsets whose molecular structures differ less gave better predictions, and the non-linear models gave predictions that were, on the whole, better than or comparable to those of the linear models.
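The classify-then-model workflow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptor values and property values below are invented toy numbers, the helper names (`kmeans`, `fit_mlr`, `predict`, `nearest`) are hypothetical, only the K-means and MLR steps are shown, and assigning a new compound to the nearest cluster centroid is an assumption made for illustration.

```python
# Toy sketch: cluster samples by descriptors, then fit one MLR model per cluster.
# All data and function names are illustrative assumptions, not from the paper.

def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))

def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means; the first k points seed the centroids (arbitrary choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def fit_mlr(X, y):
    """Ordinary least squares with intercept, solving the normal equations
    by Gauss-Jordan elimination. Assumes the descriptors are not collinear."""
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]  # [intercept, b1, b2, ...]

# Invented descriptors (two per compound) and property values for eight compounds.
descriptors = [(0, 0), (1, 0), (0, 1), (1, 1),
               (10, 10), (11, 10), (10, 11), (11, 11)]
values = [1, 3, 4, 6, -5, -4, -6, -5]

labels, centroids = kmeans(descriptors, k=2)

# One regression model per cluster.
models = {}
for c in set(labels):
    X = [d for d, lab in zip(descriptors, labels) if lab == c]
    y = [v for v, lab in zip(values, labels) if lab == c]
    models[c] = fit_mlr(X, y)

def predict(x):
    """Route a new compound to its nearest cluster, then apply that cluster's model."""
    b = models[nearest(x, centroids)]
    return b[0] + sum(w * v for w, v in zip(b[1:], x))

print(labels)
print(predict((0.5, 0.5)), predict((10.5, 10.5)))
```

In this toy data each cluster follows its own linear relationship, so a single global MLR model would fit poorly while the per-cluster models fit well. That mirrors the abstract's observation that modeling structurally similar subsets can improve predictions.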

Keywords: sensitization; polarity parameters; quantitative structure-property relationship (QSPR)

Classification codes: TQ015.9 [Chemical Engineering]; TP391.9 [Automation and Computer Technology: Computer Application Technology]

 
