Extraction of Result Set for Gas Chromatography-Mass Spectrometry Database Search Based on Random Mapping


Authors: JIANG Lihong (蒋丽红); ZHOU Nian (周念); PENG Lili (彭莉莉); CHEN Peng (陈鹏); ZHANG Jun (章军) [3]

Affiliations: [1] School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China; [2] Institute of Physical Science and Information Technology, Anhui University, Hefei 230601, China; [3] School of Electrical Engineering and Automation, Anhui University, Hefei 230601, China

Source: Journal of Anhui University of Technology (Natural Science), 2017, No. 4, pp. 373-378 (6 pages)

Funding: National Natural Science Foundation of China (61472282)

Abstract: Random mapping-based mass spectral library search is a fast way to match the mass spectrum of a target molecule against those of standard compounds. It builds the result set from the few candidate molecules with the highest matching similarity. Because there is no well-founded basis for setting the cutoff, however, the method tends to drop some correct results, which lowers the identification rate. To address this problem, the result sets produced by the random mapping search were analyzed statistically. The analysis shows that 96.60% of successfully matched molecules have a matching similarity greater than 0.85, and that among molecules whose correct match is not the candidate with the highest similarity, 97.19% have a similarity that differs from the highest similarity by no more than 0.07. Based on these findings, the existing random mapping-based search is improved with a more precise result set extraction method, called dynamic interception. Experimental results show that the proposed method raises the identification rate of the existing method by 1.89% and reaches an average matching accuracy of 98.60%, so that molecules are identified more accurately and the algorithm becomes more robust.
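The two statistical findings in the abstract suggest a cutoff that adapts to each query rather than a fixed top-k. The following is a minimal sketch of that idea, not the authors' implementation: the function name `dynamic_cut`, the candidate representation, and the way the two thresholds are combined (both must hold) are assumptions made for illustration. Only the values 0.85 and 0.07 come from the abstract.

```python
from typing import List, Tuple

# Hypothetical representation: (compound name, similarity score in [0, 1]).
Candidate = Tuple[str, float]


def dynamic_cut(
    candidates: List[Candidate],
    min_similarity: float = 0.85,  # from the abstract: most correct matches exceed 0.85
    max_gap: float = 0.07,         # from the abstract: most correct non-top matches lie within 0.07 of the top
) -> List[Candidate]:
    """Keep candidates that pass both thresholds (assumed combination rule)."""
    if not candidates:
        return []
    # Rank by similarity, best first.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    top_score = ranked[0][1]
    return [
        c for c in ranked
        if c[1] > min_similarity and top_score - c[1] <= max_gap
    ]


hits = [("A", 0.95), ("B", 0.90), ("C", 0.86), ("D", 0.80)]
result = dynamic_cut(hits)
print(result)
```

In this example, C is dropped because it lies more than 0.07 below the top score, and D is dropped because it falls below 0.85. Under this sketch, a query whose best score does not exceed 0.85 yields an empty result set; whether the published method behaves that way is not stated in the abstract.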

Keywords: gas chromatography-mass spectrometry; random mapping; similarity measure; result set

Classification number: Q81 [Biology: Bioengineering]

 
