Characterization of applicability domains for QSAR models (QSAR模型应用域的表征方法)  Cited by: 7


Authors: Zhongyu Wang (王中钰)[1], Jingwen Chen (陈景文)[1], Zhiqiang Fu (傅志强)[1], Xuehua Li (李雪花)[1]

Affiliation: [1] Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China

Source: Chinese Science Bulletin (《科学通报》), 2022, Issue 3, pp. 255-266 (12 pages)

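Funding: Supported by the National Key Research and Development Program of China (2018YFC1801604, 2018YFE0110700) and the National Natural Science Foundation of China (21661142001).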

Abstract: Quantitative structure-activity relationship (QSAR) models are important tools for filling data gaps in the environmental safety of chemicals. To be used properly in chemicals management, QSAR models need clearly defined applicability domains. This paper reviews three concepts of the applicability domain: the descriptor domain, the structural domain and the mechanism domain. Using case studies, it focuses on methods for computing the structural domain from molecular fingerprints and similarity metrics, and on the characteristics and advantages of the structural domain. It discusses the activity cliffs that appear in structure-activity landscapes and their causes. To better understand the applicability of descriptors, interpret QSAR mechanisms and choose an appropriate method for characterizing the applicability domain, it is necessary to understand the system that the prediction endpoint essentially describes, the complexity and spatial heterogeneity of that system, and whether the endpoint captures emergent behavior of the system.

In environmental science and engineering, a quantitative structure-activity relationship (QSAR) is a quantitative relationship between the structure of molecules (or their aggregates, e.g., nanoparticles) and certain endpoints. Here, endpoints generally refer to physicochemical properties, biological effects, environmental behavior parameters and similar quantities that can be measured or modeled. From a data set of chemical structures with known endpoint values (the training set), a QSAR model uses a specific algorithm to establish a mathematical relationship between the numerical features that characterize molecular structure (the descriptors) and the endpoint values. The established relationship can then be used to predict endpoint values for given chemical structures. QSAR models are important tools for filling data gaps in the environmental safety of chemicals and for addressing the so-called “emerging pollutants”, which are closely related to the improper management of chemicals. Notably, QSAR models are intrinsically data-driven: the relationships present in the training set do not necessarily hold for arbitrary chemical structures, so the reliability of a QSAR model is always limited to a certain applicability domain. Acceptance of QSAR models in the sound management of chemicals therefore requires clearly defined applicability domains. This study reviewed three concepts of the applicability domain: descriptor domain, structural domain and mechanism domain. For characterizing the descriptor domain, methods based on the hyper-rectangle, convex hull, joint probability density estimation and various types of distances were described. Notably, when Boolean fingerprints are used as descriptors, these methods become meaningless. Thus, the implementation, characteristics and advantages of the structural domain, which is based on fingerprints and similarity, were specially introduced. Moreover, structure-activity landscapes (SALs), as exemplified by a network-like similarity graph (NSG) and a 3D topography of the endpoint

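Keywords: quantitative structure-activity relationship (QSAR); applicability domain; descriptor; activity cliff; structure-activity landscape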

Classification code: X505 [Environmental Science and Engineering: Environmental Engineering]
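
Illustrative note: the abstract contrasts a descriptor domain (for example a hyper-rectangle over descriptor ranges) with a structural domain built from fingerprints and a similarity measure. The following is a minimal sketch of that contrast, not the authors' implementation. The function names, the toy data, the nearest-neighbor rule and the 0.5 Tanimoto threshold are all assumptions made for illustration. It shows one plausible reading of why range-based checks lose meaning for Boolean fingerprints: once every bit takes both values in the training set, the bounding box is the whole {0, 1} space.

```python
def in_hyper_rectangle(train, query):
    """Return True if every descriptor of `query` lies inside the
    per-descriptor [min, max] range observed in the training set."""
    lows = [min(col) for col in zip(*train)]
    highs = [max(col) for col in zip(*train)]
    return all(lo <= x <= hi for lo, x, hi in zip(lows, query, highs))


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two equal-length 0/1 fingerprints:
    shared on-bits divided by bits that are on in either fingerprint."""
    both = sum(1 for x, y in zip(fp_a, fp_b) if x and y)
    either = sum(1 for x, y in zip(fp_a, fp_b) if x or y)
    # Convention chosen here (an assumption): two all-zero fingerprints
    # are treated as identical.
    return both / either if either else 1.0


def in_structural_domain(train_fps, query_fp, threshold=0.5):
    """Hypothetical rule: the query is inside the structural domain if its
    most similar training compound reaches `threshold` (illustrative value)."""
    return max(tanimoto(fp, query_fp) for fp in train_fps) >= threshold


# Toy continuous descriptors (made-up values, two descriptors per compound).
train_desc = [[1.2, 100.0], [2.5, 180.0], [3.1, 150.0]]
inside_query = [2.0, 120.0]
outside_query = [4.0, 120.0]  # first descriptor exceeds the training maximum

# Toy 6-bit Boolean fingerprints; every bit is 0 in one compound and 1 in the other.
train_fps = [[1, 1, 1, 0, 0, 0],
             [0, 0, 0, 1, 1, 1]]
odd_query = [1, 0, 0, 1, 0, 0]    # shares only one on-bit with each training compound
close_query = [1, 1, 0, 0, 0, 0]  # shares two of three on-bits with the first compound

if __name__ == "__main__":
    print("descriptor box, inside query:", in_hyper_rectangle(train_desc, inside_query))
    print("descriptor box, outside query:", in_hyper_rectangle(train_desc, outside_query))
    print("descriptor box on fingerprints:", in_hyper_rectangle(train_fps, odd_query))
    print("structural domain, odd query:", in_structural_domain(train_fps, odd_query))
    print("structural domain, close query:", in_structural_domain(train_fps, close_query))
```

In this toy setup the range check accepts the dissimilar fingerprint because each bit's observed range is already [0, 1], whereas the similarity-based check rejects it. In practice the fingerprint type, the similarity metric and the threshold would need to be chosen and validated for the specific model and endpoint.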

 
