Using ensemble learning algorithms to develop QSAR models on bioconcentration factors of organic chemicals in multispecies fish  Cited by: 12


Authors: Rui DING; Jingwen CHEN[1]; Yang YU; Jun LIN; Zhongyu WANG; Weihao TANG; Xuehua LI[1]

Affiliations: [1] Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China; [2] Solid Waste and Chemicals Management Center, Ministry of Ecology and Environment, Beijing 100029, China

Source: Environmental Chemistry (《环境化学》), 2021, Issue 5, pp. 1295-1304 (10 pages)

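Funding: Supported by the National Key Research and Development Program of China (2018YFC1801604, 2018YFE0110700) and the National Natural Science Foundation of China (21661142001).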

Abstract: The bioconcentration factor (BCF) is a key parameter for characterizing the bioaccumulation of chemicals in organisms. However, only around one thousand chemicals have BCF values, whereas over 350,000 chemicals have been registered for production and use in the global market. Quantitative structure-activity relationship (QSAR) models are regarded as an efficient way to fill this data gap. Most existing QSAR models for BCF are individual models, although ensemble models may improve BCF prediction. In this study, a comprehensive fish BCF database was constructed, covering measured BCF values of more than 1300 organic chemicals. Based on this database, 5 individual QSAR models and 11 ensemble models for the BCF of organic compounds in fish were developed using machine learning algorithms, following the OECD guidelines on the development and validation of QSARs. The results show that the ensemble models have better goodness-of-fit, robustness, and predictive accuracy, as well as a wider applicability domain, than the individual models. The optimal ensemble model was then used to predict BCF for chemicals in the Inventory of Existing Chemical Substances of China (IECSC), indicating that 1066 chemicals in the inventory are bioaccumulative and 86 are very bioaccumulative. The models can provide data needed for evaluating the bioaccumulation capacity of chemicals and support chemical risk assessment and management.

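Keywords: bioconcentration factor; quantitative structure-activity relationship; machine learning; ensemble model; applicability domain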

Classification No.: X171.5 [Environmental Science and Engineering: Environmental Science]
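
The abstract's central idea, combining several individual QSAR models into an ensemble and then screening chemicals for bioaccumulation, can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the abstract does not say how the 11 ensembles were built, so plain averaging of log10(BCF) predictions is used as one common combination scheme, and the 2000 and 5000 L/kg cut-offs are those of EU REACH Annex XIII, which may differ from the criteria the authors applied. The function names and the prediction values are hypothetical.

```python
from statistics import mean

# Assumed regulatory cut-offs in L/kg (REACH Annex XIII style);
# the abstract does not state which thresholds were used.
B_THRESHOLD = 2000.0
VB_THRESHOLD = 5000.0


def ensemble_log_bcf(predictions):
    """Average the log10(BCF) predictions of several individual models."""
    if not predictions:
        raise ValueError("need at least one model prediction")
    return mean(predictions)


def classify_bcf(log_bcf):
    """Map a predicted log10(BCF) to a bioaccumulation class."""
    bcf = 10.0 ** log_bcf
    if bcf > VB_THRESHOLD:
        return "vB"      # very bioaccumulative
    if bcf > B_THRESHOLD:
        return "B"       # bioaccumulative
    return "not B"


# Hypothetical log10(BCF) outputs of three individual models for one chemical.
individual_predictions = [3.2, 3.5, 3.6]
consensus = ensemble_log_bcf(individual_predictions)
label = classify_bcf(consensus)
```

Averaging in log units keeps a single outlying model from dominating the consensus, which is one plausible reason an ensemble can be more robust than any individual model.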

 
