一种基于Aho-Corasick算法改进的多模式匹配算法  被引量:16

An improved multi-pattern matching algorithm based on Aho-Corasick algorithm

在线阅读下载全文

作  者:陈永杰[1] 吾守尔.斯拉木 于清[1] CHEN Yongjie;Wushour Silamu;YU Qing(School of Information Science and Engineering,Xinjiang University,Urumqi 830046,China)

机构地区:[1]新疆大学信息科学与工程学院,新疆乌鲁木齐830046

出  处:《现代电子技术》2019年第4期89-93,共5页Modern Electronics Technique

基  金:国家"973"重点基础研究计划(2014CB340506)~~

摘  要:目前互联网中以文本存在的数据非常庞大,针对在如此庞大的文本中如何准确、快速地找到多个不同的目标字符串的问题,在介绍常见的模式匹配算法的优点和缺点基础上,结合Trie速多模式匹配算法。根据对比性实验的结果分析得出,改进AC且匹配速度大约是AC算法的5倍。There exists a large amount of text data on the Internet currently.In allusion to the problem that how to search out multiple different target character strings accurately and quickly in such large text,an improved fast multi-pattern matching algorithm is proposed on the basis of introducing the advantages and disadvantages of common pattern matching algorithms,and combining with the idea of converting the Trie tree into the double array form.A comparison experiment was carried out.The analysis results show that the improved AC algorithm can successfully match all the to-be queried pattern strings in the text,and its matching speed is about 5 times of that of the AC algorithm,which shows that the improved AC algorithm has good effects in aspects of matching speed,recall ratio and space utilization rate.

关 键 词:字符串匹配 多模式匹配 TRIE树 双数组 AC算法 匹配速度 

分 类 号:TN919-34[电子电信—通信与信息系统] TP301.6[电子电信—信息与通信工程]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象