Generic Interpretable Reaction Condition Predictions with Open Reaction Condition Datasets and Unsupervised Learning of Reaction Center  


Authors: Xiaorui Wang, Chang-Yu Hsieh, Xiaodan Yin, Jike Wang, Yuquan Li, Yafeng Deng, Dejun Jiang, Zhenxing Wu, Hongyan Du, Hongming Chen, Yun Li, Huanxiang Liu, Yuwei Wang, Pei Luo, Tingjun Hou, Xiaojun Yao

Affiliations: [1] Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macao Institute for Applied Research in Medicine and Health, Macao University of Science and Technology, Macao, 999078, China [2] Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China [3] Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China [4] College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, 730000, China [5] Carbon Silicon AI Technology Co., Ltd, Hangzhou, Zhejiang, 310018, China [6] Center of Chemistry and Chemical Biology, Guangzhou Regenerative Medicine and Health Guangdong Laboratory, Guangzhou, 510530, China [7] College of Pharmacy, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi, 712044, China

Source: Research, 2024, Issue 2, pp. 773-794 (22 pages)

Funding: Funded by the Science and Technology Development Fund, Macao SAR (File nos. 0056/2020/AMJ, 0114/2020/A3, 0015/2019/AMJ), and by Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macao University of Science and Technology, Macao, China (001/2020/ALC).

Abstract: Effective synthesis planning powered by deep learning (DL) can significantly accelerate the discovery of new drugs and materials. However, most DL-assisted synthesis planning methods offer little or no capability to recommend suitable reaction conditions (RCs) for their reaction predictions. Currently, the prediction of RCs with a DL framework is hindered by several factors, including: (a) lack of a standardized dataset for benchmarking, (b) lack of a general prediction model with powerful representation, and (c) lack of interpretability. To address these issues, we first created 2 standardized RC datasets covering a broad range of reaction classes and then proposed a powerful and interpretable Transformer-based RC predictor named Parrot. Through careful design of the model architecture, pretraining method, and training strategy, Parrot improved the overall top-3 prediction accuracy on catalysis, solvents, and other reagents by as much as 13.44%, compared to the best previous model on a newly curated dataset. Additionally, the mean absolute error of the predicted temperatures was reduced by about 4 °C. Furthermore, Parrot manifests strong generalization capacity with superior cross-chemical-space prediction accuracy. Attention analysis indicates that Parrot effectively captures crucial chemical information and exhibits a high level of interpretability in the prediction of RCs. The proposed model Parrot exemplifies how a modern neural network architecture, when appropriately pretrained, can be versatile in making reliable, generalizable, and interpretable recommendations for RCs even when the underlying training dataset may still be limited in diversity.

Keywords: VERSATILE ABSOLUTE GENERALIZATION

Classification code: O62 [Science: Organic Chemistry]

 
