检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]西安理工大学计算机科学与工程学院 [2]西安工程大学电子信息学院,陕西西安710048
出 处:《软件学报》2008年第3期674-686,共13页Journal of Software
基 金:Supported by the Key Science-Technology Project of Xi'an of China under Grant No.06JK225 (西安市科技攻关项目);the Research Program of the Education Department of Shannxi Province of China under Grant No.06JK225 (陕西省教育厅专项科研计划)
摘 要:分析了中英文混合环境下多模式匹配的特点,以及已有多模式匹配算法应用于中英文混合环境时的不足,给出并证明了中英文混合环境下多模式匹配算法的性能定理,提出了一种适合于中英文混合环境的基于线索完全哈希Trie结构的多模式匹配算法.该算法扩展了标准Trie结构,以中英文字符内码为键值构造完全哈希Trie匹配机,并利用模式串之间的关系对Trie匹配机进行线索化.理论分析与实验结果表明,所提出的算法在匹配中无需复杂的哈希运算,不需要回溯匹配指针,在中英文混合环境下能够进行正确、高效的匹配,而且不存在空间膨胀问题,具有较低的空间与时间复杂度,有较大理论与应用价值.The characteristics of multiple pattern matching in mixed Chinese and English text and the problem of the existing multiple pattern matching algorithms used for processing mixed Chinese and English text are analyzed. A theorem of multiple pattern matching in mixed Chinese and English text is discovered and proved. A novel multiple pattern matching algorithm based on the threaded trie tree is proposed, which expands the standard trie structure, constructs the hash trie matching machine with the codes of Chinese and English characters, and threads the trie tree according to the characteristic of patterns set. The proposed algorithm does not need complex hash operation, and the matching pointer does not need backdate during matching. Theoretic analysis and experimental results demonstrate that the proposed algorithm efficiently solves the space expansion problem, and process mixed Chinese and English text correctly and efficiently with lower time and space complexity.
分 类 号:TP301[自动化与计算机技术—计算机系统结构]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.145