Automatic Mining of Dependency Relation Rules for Relation Words in Chinese Compound Sentences Based on the FP-Tree Algorithm


Author: 涂馨丹 (Tu Xindan)

Affiliation: [1] 武汉设计工程学院 (Wuhan Institute of Design and Sciences), Wuhan, Hubei

Source: Computer Science and Application (《计算机科学与应用》), 2021, No. 5, pp. 1538-1547 (10 pages)

Abstract: The relation word recognition rule base currently contains 734 rules, most of which rely on surface (literal) features; rules based on dependency relations still need to be added. Building on dependency grammar, this paper uses the FP-tree algorithm for frequent itemset mining to mine dependency rules in compound sentences automatically. The corpus is first preprocessed: to avoid rescanning the database on every pass, the compound sentences are classified by relation word, and classes whose data sets are too small are excluded to safeguard the quality of the mined rules. A feature analyzer then processes the preprocessed corpus, and its output is formalized as a set of dependency features for each compound sentence. Finally, the FP-tree algorithm is applied to the experimental corpus, yielding 84 rules. The experimental results indicate that mining dependency rules automatically with the FP-tree algorithm is feasible and effective.
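The pipeline described in the abstract (group by relation word, drop small groups, mine frequent feature sets) can be illustrated with a minimal sketch. It is not the paper's implementation: the feature labels such as `front:ADV`, the thresholds `MIN_GROUP_SIZE` and `MIN_SUPPORT`, the toy corpus, and the helper `mine_by_relation_word` are all assumptions made for illustration, and the FP-growth routine is a plain textbook version.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 3  # assumed cutoff below which a relation-word class is dropped
MIN_SUPPORT = 2     # assumed absolute support threshold


class FPNode:
    """One FP-tree node: an item, its count, and parent/child links."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return (header, frequent)."""
    # First pass: count the support of every single item.
    support = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = FPNode(None, None)
    header = defaultdict(list)  # item -> all tree nodes carrying that item
    # Second pass: insert transactions, items ordered by descending support.
    for items, count in transactions:
        ordered = sorted(
            (i for i in set(items) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset."""
    header, frequent = build_fp_tree(transactions, min_support)
    for item in sorted(frequent, key=lambda i: (frequent[i], i)):
        itemset = suffix | {item}
        yield itemset, frequent[item]
        # Conditional pattern base: the prefix path of each node holding `item`.
        conditional = []
        for node in header[item]:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_support, itemset)


def mine_by_relation_word(corpus, min_group_size, min_support):
    """Group sentences by relation word, drop small groups, mine each group."""
    groups = defaultdict(list)
    for relation_word, features in corpus:
        groups[relation_word].append((features, 1))
    return {
        word: dict(fp_growth(group, min_support))
        for word, group in groups.items()
        if len(group) >= min_group_size
    }


# Toy corpus: (relation word, hypothetical dependency features of one sentence).
corpus = [
    ("因为", ["front:ADV", "back:ADV", "pos:c"]),
    ("因为", ["front:ADV", "back:ADV"]),
    ("因为", ["front:ADV", "pos:p"]),
    ("虽然", ["front:ADV"]),  # only one sentence, so this class is excluded
]
rules = mine_by_relation_word(corpus, MIN_GROUP_SIZE, MIN_SUPPORT)
```

Grouping before mining means each FP-tree is built from one relation word's sentences only, which mirrors the abstract's point about avoiding repeated scans of the whole database. Each frequent feature set in `rules` is only a candidate; how the paper turns such sets into the 84 reported rules is not stated in the abstract.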

Keywords: relation words; dependency relations; rule mining; FP-tree

Classification number: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
