检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
出 处:《数据采集与处理》2001年第4期459-465,共7页Journal of Data Acquisition and Processing
基 金:国家 8 6 3计划 (编号 :86 3- 30 6 - ZD0 3- 0 4 - 1)资助项目
摘 要:字符串的模式匹配问题是计算机科学的基本问题之一 ,而近似模式匹配更是近期的研究热点。本文分析了文本分析领域中出现的一种特殊的近似模式匹配问题 ,即字符串有序跳跃模式近似匹配问题 ,提出了一种基于有限自动机的组件组合分析算法。算法的特点在于将组件匹配过程与组配过程进行分离 ,这样既降低了问题的复杂度 ,又可以实现按策略组配的灵活性。组件匹配过程中利用有限自动机对跳跃模式的组件进行匹配查找 ;组件的组配过程中先对查找到的组件进行组合分析 ,然后再对各种组合进行初步筛选和基于策略的优选。初步筛选工作是依据顺序性、唯一性和最大数三条原则进行 ;而优选工作是根据四个设计的评价参数选择其中最佳组合。实验结果表明 ,该算法的确能解决字符串有序跳跃模式匹配问题 。The string pattern matching is always viewed as a fundamental problem in computer science. Furthermore, the approximate pattern matching has attracted more interest recently and now becomes a new research field. This paper analyzes a kind of special approximate pattern matching problem i.e. sequential skip pattern matching in the field of text analysis. A component combinatorial analysis algorithm based on the deterministic state automata is presented. The main feature of the algorithm is that the process of component matching and assembling is separated. So, it not only decrease the complexity of problem, but also provide the flexible solution for assembling component depended on policy. The automata is used to search the components of the patterns in the process of matching component. In the process of assembling the components, combinatorial analysis is done firstly for the found components, and preliminary selection and further selection depended on po licy are made secondly. The preliminary selection is based on three principles: sequential rule, sole rule, and maximum occurrence number rule. And the further selection is based four parameters of evaluation to select the best result. Experiments demonstrate that the algorithm can solve such approximate string matching problem, and it can be used in the filed of sentence pattern matching and skip word matching.
关 键 词:有限状态自动机 字符串 模式匹配 计算机 有序跳跃模式
分 类 号:TP391.43[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.225