Author: 涂馨丹
Affiliation: [1] 武汉设计工程学院, Wuhan, Hubei
Source: 《计算机科学与应用》 (Computer Science and Application), 2021, No. 5, pp. 1538-1547 (10 pages)
Abstract: The relation word recognition rule base currently contains 734 rules. Most are based on surface (literal) features, and rules based on dependency relations still need to be added. Building on dependency grammar, this paper uses the FP-tree algorithm for frequent itemset mining to automatically mine dependency rules from complex sentences. The corpus is first preprocessed: to avoid rescanning the database on every pass, the complex sentences are classified by relation word, and classes whose data sets are too small are excluded to safeguard the quality of the mined rules. A feature analyzer then analyzes the preprocessed corpus, and the results are formalized into a set of dependency features for each complex sentence. Finally, the FP-tree algorithm is applied to the experimental corpus, yielding 84 rules. The experimental results indicate that automatically mining dependency rules with the FP-tree algorithm is feasible and effective.
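The pipeline summarized in the abstract (classify by relation word, exclude small classes, mine frequent feature sets with an FP-tree) might be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the feature strings such as `pos=c` and `dep=ADV`, the relation words, the thresholds, and all function names are assumptions made up for the example.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns a header table (item -> list of tree nodes) and the support
    of every frequent item.
    """
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    header, freq = build_tree(transactions, min_support)
    result = {}
    for item, nodes in header.items():
        pattern = suffix | {item}
        result[pattern] = freq[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        result.update(fp_growth(conditional, min_support, pattern))
    return result


def mine_by_relation_word(sentences, min_group_size, min_support):
    """Group feature sets by relation word, drop small groups, mine each group."""
    groups = defaultdict(list)
    for relation_word, features in sentences:
        groups[relation_word].append((features, 1))
    return {word: fp_growth(group, min_support)
            for word, group in groups.items()
            if len(group) >= min_group_size}


# Hypothetical dependency features for five complex sentences.
sentences = [
    ("because", ["pos=c", "dep=ADV", "head=v"]),
    ("because", ["pos=c", "dep=ADV", "head=v"]),
    ("because", ["pos=c", "dep=ADV", "head=a"]),
    ("because", ["pos=p", "dep=POB"]),
    ("but", ["pos=c", "dep=ADV"]),
]
rules = mine_by_relation_word(sentences, min_group_size=3, min_support=3)
for word, patterns in rules.items():
    for pattern, support in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
        print(word, sorted(pattern), support)
```

Grouping before mining means each FP-tree is built from one class only, which reflects the abstract's stated aim of avoiding repeated scans of the whole database. Here the size filter drops the single-sentence "but" class before any tree is built.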
Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]