复句关系词自动识别中规则解析的包含匹配算法研究

Research on containing matching algorithm of rule-interpreter in the automatic recognition for relation word of Chinese compound sentences

Cited by: 3

Authors: Hu Jinzhu[1], Hu Quan[2], Shu Jiangbo[3]

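Affiliations: [1] School of Computer Science, Central China Normal University, Wuhan 430079; [2] College of Physical Science and Technology, Central China Normal University, Wuhan 430079; [3] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079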

Source: Journal of Central China Normal University: Natural Sciences (《华中师范大学学报(自然科学版)》), 2014, No. 5, pp. 643-649 (7 pages)

Funding: National Social Science Foundation of China project (11BYY052); National Social Science Foundation of China Youth project (13CYY037)

Abstract: The rule interpreter is an important functional module in a system for automatically recognizing relation words in modern Chinese compound sentences. It first uses the quasi relation words of a compound sentence to match rules in the rule base, then parses the rules that match, and finally invokes each matched rule and extracts its conclusion to identify the relation words. Successful rule matching is therefore the precondition for rule interpretation. However, when matching against the sentence-pattern rule table and the consecutive-use sentence-pattern rule table in the rule base, the diversity and repetition of quasi relation words make matching complex enough that traditional matching algorithms cannot be applied. The paper therefore studies a "containing matching algorithm": it first stores, in order, the subscripts (positions) of the quasi relation word sequence within the compound sentence in a two-dimensional array, and then searches that array for subsequences that may match. The main advantage of the algorithm is that it requires neither complete matching nor backtracking, while still covering all substrings of the pattern string and yielding all target substrings. Experimental results show that, once errors caused by incomplete rules and by word segmentation are excluded, the accuracy of the algorithm can reach 100%.
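
The abstract gives only an outline of the containing matching idea, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes the sentence has already been segmented into tokens and that a rule is a plain ordered list of relation words. The function names `build_position_table` and `containing_matches` and the sample sentence are hypothetical. The sketch builds a two-dimensional table of positions and then enumerates every in-order combination. It makes no attempt to reproduce the efficiency properties the paper claims.

```python
from itertools import product


def build_position_table(tokens, pattern):
    """Build a 2D table: row i lists every index in `tokens` where pattern[i] occurs."""
    return [
        [idx for idx, tok in enumerate(tokens) if tok == word]
        for word in pattern
    ]


def containing_matches(tokens, pattern):
    """Return every tuple of strictly increasing positions, one per pattern word.

    Assumption: a rule "matches" when its relation words all appear in the
    sentence in the same order, with any other tokens allowed in between.
    """
    table = build_position_table(tokens, pattern)
    matches = []
    # Try one position from each row; keep only combinations that preserve order.
    for combo in product(*table):
        if all(a < b for a, b in zip(combo, combo[1:])):
            matches.append(combo)
    return matches


# Hypothetical segmented sentence in which "所以" occurs twice.
tokens = ["因为", "下雨", ",", "所以", "他", "没", "来", ",", "所以", "我", "走", "了"]
print(build_position_table(tokens, ["因为", "所以"]))  # [[0], [3, 8]]
print(containing_matches(tokens, ["因为", "所以"]))    # [(0, 3), (0, 8)]
```

Because a repeated quasi relation word simply adds another entry to its row, every candidate occurrence is kept in the table, which is how this sketch returns all target subsequences instead of only the first.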

Keywords: relation words of compound sentences; automatic recognition; rule interpreter; containing matching algorithm

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
