Cocrystal virtual screening based on the XGBoost machine learning model  

Authors: Dezhi Yang, Li Wang, Penghui Yuan, Qi An, Bin Su, Mingchao Yu, Ting Chen, Kun Hu, Li Zhang, Yang Lu, Guanhua Du

Affiliations: [1] Beijing City Key Laboratory of Polymorphic Drugs, Center of Pharmaceutical Polymorphs, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China; [2] Shandong Soteria Pharmaceutical Co., Ltd., Laiwu 271100, China; [3] Beijing City Key Laboratory of Drug Target and Screening Research, National Center for Pharmaceutical Screening, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China

Source: Chinese Chemical Letters (中国化学快报, English edition), 2023, Issue 8, pp. 398-403 (6 pages)

Funding: The authors acknowledge the National Natural Science Foundation of China (No. 22278443); the CAMS Innovation Fund for Medical Sciences (No. 2022-I2M-1-015); the Key R&D Program of Shandong Province (No. 2019JZZY020909); and the Xinjiang Uygur Autonomous Region Innovation Environment Construction Special Fund and Technology Innovation Base Construction Key Laboratory Open Project (No. 2022D04016) for financial support.

Abstract: Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Artificial intelligence-based co-crystal virtual screening in the early stage of drug development has therefore attracted extensive attention from researchers. However, the complexity of developing and applying algorithms hinders its wide application. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model of the scikit-learn package. Only the simplified molecular input line entry specification (SMILES) information of two compounds is needed as input to determine whether a co-crystal can be formed. The data set includes the co-crystal records in the Cambridge Structural Database (CSD) and records of no co-crystal formation from the extant literature and experiments. RDKit molecular descriptors are adopted as the features of each compound in the data set. The developed model shows excellent performance on the proposed co-crystal training and validation sets, with high accuracy, sensitivity, and F1 score. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately.
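The abstract reports accuracy, sensitivity, and F1 score for a binary prediction (co-crystal formed or not). As a minimal sketch of how these three metrics relate to one another, the following pure-Python example computes them from true and predicted labels. The function name `screening_metrics` and the label lists are hypothetical and invented for illustration; they are not data or code from the study.

```python
def screening_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and F1 for binary labels.

    Assumption for this sketch: label 1 means "co-crystal forms",
    label 0 means "no co-crystal".
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity (recall): share of real co-crystals that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of predicted co-crystals that were real.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


# Made-up labels for illustration only; not results from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
metrics = screening_metrics(y_true, y_pred)
print(" ".join(f"{name}={value:.2f}" for name, value in metrics.items()))
```

Sensitivity matters in a screening setting because a missed co-crystal (false negative) is a candidate that never reaches the bench, whereas accuracy alone can look high on an imbalanced data set.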

Keywords: Cocrystal; Machine learning; XGBoost; Molecular descriptor; Praziquantel; Nefiracetam

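Classification codes: TQ460.1 [Chemical Engineering: Pharmaceutical Chemical Engineering]; TP181 [Automation and Computer Technology: Control Theory and Control Engineering]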

 
