一种改进的针对中文编码的Wu-Manber多模式匹配算法  被引量:4

An Improved Wu-Manber Multi-pattern Matching Algorithm for Chinese Encoding

在线阅读下载全文

作  者:王一霈 石春[1] 戴上静[1] 吴刚[1] 

机构地区:[1]中国科学技术大学自动化系工业自动化研究所,合肥230027

出  处:《小型微型计算机系统》2015年第4期778-781,共4页Journal of Chinese Computer Systems

基  金:2012年安徽省科技攻关重点项目(12010202038)资助;新能源汽车产业技术创新工程资助

摘  要:Wu-Manber算法是多模式匹配领域性能优越的算法之一.针对Wu-Manber算法不能很好的用于中文环境,以及滑动距离受限和冗余匹配的问题,提出一种改进的针对中文编码的WM_CH多模式匹配算法.WM_CH针对中文编码修改了哈希函数,优化了建立哈希表的过程;修改并优化了算法匹配过程,在执行精确匹配时消除了冗余匹配,增大了单次精确匹配后的滑动距离.实际测试表明,该算法性能优异,保持与原算法匹配精确度一致,针对中文编码能快速过滤非中文字符.在特征串集规模大于50 000时,匹配速度比原算法提升40%以上,同时滑动窗口的跳转次数显著下降.Wu-Manber algorithm is one of the excellent solutions for multi-pattern matching problem. As the Wu-Manber algorithm is not quite suitable for Chinese encoding, besides there is shifting distance restriction and redundant matching problem, we propose an improved WMCH multi-pattern matching algorithm, especially applicable to Chinese encoding. WM_CH algorithm modifies thehash function for Chinese encoding and optimizes the process of establishing hash table;WM_CH algorithm also optimizes the matching procedure, eliminating redundant matching process and increasing the shifting distance. The WM_CH algorithm performs significantly, maintaining the same matching accuracy as WM algorithm meanwhile can filter non-Chinese characters fast. The efficiency of the WM _CH algorithm has been improved by 40% compared to classic algorithms while the number of patterns is over 50,000. Experimental results show that the shifting number of sliding window have also been significantly reduced.

关 键 词:多模式匹配算法 特征串 Wu—Manber算法 WM_CH算法 

分 类 号:TP301[自动化与计算机技术—计算机系统结构]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象