Authors: HU Xiaowen, YANG Jianbo [1], WEI Feng [1], MA Shuangcheng [1] (National Institutes for Food and Drug Control, Beijing 102629, China)
Source: Chinese Journal of Pharmacovigilance (《中国药物警戒》), 2022, No. 4, pp. 390-394 (5 pages)
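Funding: National Science and Technology Major Project for Significant New Drugs Development, 2018 (2018ZX09735006); National Natural Science Foundation of China (81773874, 81973476).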
Abstract: Objective To cluster the natural products of Polygonum multiflorum Thunb. (何首乌) and establish a sound clustering method for natural products that can guide subsequent compound selection and pharmacological screening. Methods Natural products of Polygonum multiflorum were collected and curated from the literature. Compounds from the major categories, including stilbenes and anthraquinones, were selected as clustering objects and converted into the simplified molecular input line entry specification (SMILES). Extended-connectivity fingerprints and physicochemical properties were extracted with rdkit as features and filtered by variance. The compounds were clustered with the spectral clustering algorithm, and the clustering parameters were optimized using the Calinski-Harabasz (CH) index as the evaluation metric. The compounds were then clustered with the optimal parameters, and the characteristics of each cluster were analyzed. Principal component analysis of the three main categories was carried out to visualize their spatial distribution. Finally, the lipid-water partition coefficient (LogP) and topological polar surface area (TPSA) of the main categories were calculated, and their distributions were analyzed to check whether the clustering was reasonable. Results A total of 123 natural products in 13 categories were selected from the literature. Feature extraction and removal of near-zero-variance features yielded 207 valid features. The CH index was highest when the number of clusters was set to 10 and γ to 0.004. Principal component analysis showed that the three major categories each formed a separate cluster in three-dimensional space, with no overlap. After clustering, the LogP and TPSA values tended to be more concentrated. Conclusion The spectral clustering algorithm can not only distinguish compounds with markedly different structures among the natural products of Polygonum multiflorum, but also clusters complex compounds well. The clustering results are reasonably sound and offer a new approach for traditional pharmacological screening.
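The abstract selects the clustering parameters by maximizing the CH index but does not spell out how that index is computed or which implementation was used. As a minimal sketch, assuming the standard definition CH = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster dispersion, W the within-cluster dispersion, n the number of compounds and k the number of clusters, the plain-Python code below scores two candidate labelings and keeps the better one. The function name, the toy feature vectors and the labelings are hypothetical and stand in for the 207 filtered features and the spectral clustering output.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Return the CH index: [B / (k - 1)] / [W / (n - k)]."""
    n = len(points)
    dim = len(points[0])

    # Group the feature vectors by cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if not 1 < k < n:
        raise ValueError("need more than 1 and fewer than n clusters")

    def centroid(rows):
        return [sum(row[j] for row in rows) / len(rows) for j in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0  # B: size-weighted squared distance of cluster centers to the overall center
    within = 0.0   # W: squared distance of each point to its own cluster center
    for rows in clusters.values():
        center = centroid(rows)
        between += len(rows) * sq_dist(center, overall)
        within += sum(sq_dist(row, center) for row in rows)

    if within == 0.0:
        # Assumption of this sketch: perfectly tight clusters score as unbounded.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: four "compounds" described by two features each.
toy_features = [[0, 0], [0, 2], [10, 0], [10, 2]]

# Hypothetical candidate labelings, e.g. from two different parameter settings.
candidate_labelings = {
    "split_by_x": [0, 0, 1, 1],
    "split_by_y": [0, 1, 0, 1],
}

scores = {
    name: calinski_harabasz(toy_features, labels)
    for name, labels in candidate_labelings.items()
}
best = max(scores, key=scores.get)
print(best, scores[best])
```

On this toy data the labeling that separates the two distant groups scores 50.0, while the one that mixes them scores 0.08, so the higher CH index identifies the tighter, better separated partition. The same comparison, repeated over a grid of cluster counts and γ values, is the kind of search the abstract describes.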
Classification No.: R917 [Medicine and Health: Pharmaceutical Analysis]