检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]哈尔滨工程大学信息安全研究中心,黑龙江哈尔滨150001
出 处:《通信学报》2009年第3期119-124,共6页Journal on Communications
基 金:国家重点基础研究发展计划("973"计划)基金资助项目(2007CB311100);国家高技术研究发展计划("863"计划)基金资助项目(2007AA01Z473)~~
摘 要:提出了紧缩存储型Aho-Corasick算法变体,以异构的按需隐式存储取代同构的例行显式存储,从横向扇出压缩与纵向路径压缩2个方向入手,围绕着压缩稀疏事件表展开,当字符集大小σ=256时可将存储量缩减为原来的0.69%左右,而σ=64K时则达0.004%,即空间复杂度降为原来的(lbσ)/σ左右。依据扇出疏密程度的不同,分类采用了4种有针对性的快速事件定位方法,加之优化的失败迁移,使得存储量的大幅缩减不以速度的明显损失为代价,实验也证实了这一点。适用于需承载大型模式集和较长模式串而对时延和抖动都比较敏感的场合(如在线数据流过滤),在宽字符(如UNICODE型亚洲字符)匹配方面拥有显著优势。A variation of Aho-Corasick algorithm via compact storage was presented, which replaced homogeneous routine explicit storage with heterogeneous requirement oriented implicit one, first started from two aspects of widthwise fan-out compression and lengthwise path compression, then expanded around the compression of sparse event table, thus reduced memory usage to about 0.69% of original one when alphabet size a = 256, 0.004% when σ = 64K, and space complexity approximate (1bσ)/σ of origin. According to different degrees of fan-out, four kinds of specific fast event location methods were adopted, plus optimized failure transitions, led to the fact that the dramatic reduction of memory usage isn't at the cost of obvious loss of speed, which was also proved by experiments. It's suitable to be applied in the cases that need holding mass set of longer patterns and are sensitive to delay and jitter(for example, online data stream filtering) and presents distinguished superiority in wide-character(such as Asian character of UNICODE type) matching.
关 键 词:多模式匹配 紧缩存储 扇出压缩 路径压缩 事件定位
分 类 号:TP301[自动化与计算机技术—计算机系统结构]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.117