Authors: Junhai Qi, Chenjie Feng, Yulin Shi, Jianyi Yang, Fa Zhang, Guojun Li, Renmin Han
Affiliations: [1] Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China; [2] BioMap Research, Menlo Park, CA 94025, USA; [3] College of Medical Information and Engineering, Ningxia Medical University, Yinchuan 750004, China; [4] Institute of Engineering Medicine, Beijing Institute of Technology, Beijing 100081, China
Source: Genomics, Proteomics & Bioinformatics (Chinese title: 基因组蛋白质组与生物信息学报, English edition), 2024, Issue 1, pp. 111-119 (9 pages)
Funding: Supported by the National Key R&D Program of China (Grant Nos. 2021YFF0704300 and 2020YFA0712400); the National Natural Science Foundation of China (Grant Nos. 62072280, 61771009, 61932018, 62072441, 32241027, and T2225007); the open project of BGI-Shenzhen, Shenzhen 518000, China (Grant No. BGIRSZ20220005); and the Natural Science Foundation of Ningxia Province, China (Grant No. 2023AAC05036).
Abstract: The release of AlphaFold2 has sparked a rapid expansion in protein model databases. Efficient protein structure retrieval is crucial for the analysis of structure models, and measuring the similarity between structures is the key challenge in structural retrieval. Although existing structure alignment algorithms can address this challenge, they are often time-consuming. Currently, the state-of-the-art approach involves converting protein structures into three-dimensional (3D) Zernike descriptors and assessing similarity using Euclidean distance. However, the methods for computing 3D Zernike descriptors mainly rely on structural surfaces and are predominantly web-based, which limits their application to custom datasets. To overcome this limitation, we developed FP-Zernike, a user-friendly toolkit for computing different types of Zernike descriptors based on feature points. Users need to enter only a single command to calculate the Zernike descriptors of all structures in a customized dataset. FP-Zernike outperforms the leading method in terms of retrieval accuracy and binary classification accuracy across diverse benchmark datasets. In addition, we demonstrate the application of FP-Zernike in the construction of a descriptor database and describe the protocol used for the Protein Data Bank (PDB) dataset, to facilitate local deployment of this tool by interested readers. Our demonstration contained 590,685 structures, and at this scale, our system required only 4-9 s to complete a retrieval. The experiments confirmed that it achieved state-of-the-art accuracy. FP-Zernike is an open-source toolkit, with the source code and related data accessible at https://ngdc.cncb.ac.cn/biocode/tools/BT007365/releases/0.1, as well as through a webserver at http://www.structbioinfo.cn/.
Keywords: Zernike descriptor; Structure alignment; PDB dataset; Open-source; Retrieval system
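
The abstract describes retrieval as comparing descriptor vectors by Euclidean distance. The following is a minimal sketch of that ranking step only, not FP-Zernike's actual code or interface: the function name `rank_by_euclidean_distance`, the structure names, and the 4-dimensional toy vectors are all hypothetical placeholders, and real Zernike descriptors would be computed by the toolkit and would be much longer.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rank_by_euclidean_distance(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k database entries closest to the query descriptor.

    Hypothetical helper for illustration; not part of FP-Zernike.
    """
    scored = []
    for name, descriptor in database.items():
        if len(descriptor) != len(query):
            raise ValueError(f"descriptor length mismatch for {name}")
        # math.dist computes the Euclidean distance between two points.
        scored.append((name, math.dist(query, descriptor)))
    # A smaller distance means the descriptors are more similar.
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Made-up vectors standing in for precomputed descriptors (assumption).
toy_database = {
    "structure_a": [0.10, 0.50, 0.30, 0.90],
    "structure_b": [0.12, 0.48, 0.33, 0.88],
    "structure_c": [0.90, 0.10, 0.70, 0.20],
}
toy_query = [0.11, 0.49, 0.31, 0.89]

hits = rank_by_euclidean_distance(toy_query, toy_database, top_k=2)
for name, distance in hits:
    print(f"{name}\t{distance:.4f}")
```

A linear scan like this is the simplest possible design. The abstract does not say how the authors' system searches its 590,685-structure database, so nothing here should be read as a description of their implementation.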