A schema matching approach based on clustering and auxiliary dictionary (基于聚类和辅助词典的模式匹配方法)  Cited by: 1

Authors: LIU Guofeng (刘国峰)[1], HUANG Shaobin (黄少滨)[1], CHENG Yuan (程媛)[1], LANG Dapeng (郎大鹏)

Affiliation: [1] College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China

Source: Journal of Harbin Engineering University (《哈尔滨工程大学学报》), 2013, No. 2, pp. 214-220 (7 pages)

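Funding: National Key Technology R&D Program of China (2009BAH42B02); National Natural Science Foundation of China (60873038; 60903080); Fundamental Research Funds for the Central Universities, Harbin Engineering University (100603)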

Abstract: To address schema conflicts in Chinese-language environments, a metadata-based schema matching method is proposed. First, a feature vector is extracted for each schema from the data dictionary, and the vectors are clustered so that semantically similar schemas fall into the same cluster. Second, for different schemas within the same cluster, the semantic similarities between attributes are computed with the help of an auxiliary dictionary. Finally, several selection strategies are combined to filter the results, and a candidate match set is generated for each attribute. Experimental results show that the proposed method improves the efficiency of schema matching while achieving relatively high accuracy.

Keywords: schema matching; clustering; auxiliary dictionary; semantic similarity

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
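
The three stages named in the abstract (cluster the schemas, score attribute pairs with an auxiliary dictionary, filter to a candidate set) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say which clustering algorithm, similarity measure, or selection strategies were used, so the greedy cosine clustering, the Jaccard score, the threshold plus top-k filter, the `SYNONYMS` table, and all function names and sample schemas below are assumptions made for illustration. English attribute names stand in for the Chinese metadata the paper targets.

```python
from collections import Counter
from math import sqrt

# Hypothetical auxiliary dictionary: maps a word to a canonical concept.
SYNONYMS = {"name": "name", "title": "name", "cost": "price", "price": "price",
            "id": "id", "code": "id", "student": "student", "pupil": "student"}

def concepts(attribute):
    """Split an attribute such as 'student_name' and map each word to its concept."""
    return {SYNONYMS.get(word, word) for word in attribute.lower().split("_")}

def feature_vector(attributes):
    """Bag-of-concepts vector for one schema (assumed feature representation)."""
    return Counter(c for attr in attributes for c in concepts(attr))

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def cluster(schemas, threshold=0.5):
    """Greedy one-pass clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of schema names; element 0 is the seed
    for name, attrs in schemas.items():
        vec = feature_vector(attrs)
        for members in clusters:
            if cosine(vec, feature_vector(schemas[members[0]])) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters

def attribute_similarity(a, b):
    """Jaccard overlap of the two attributes' concept sets."""
    ca, cb = concepts(a), concepts(b)
    return len(ca & cb) / len(ca | cb)

def candidates(source, target, threshold=0.5, top_k=2):
    """Combine two simple selection strategies: a score threshold, then top-k."""
    result = {}
    for a in source:
        scored = [(attribute_similarity(a, b), b) for b in target]
        kept = sorted((s for s in scored if s[0] >= threshold), key=lambda s: -s[0])
        result[a] = [b for _, b in kept[:top_k]]
    return result

# Made-up sample schemas, for illustration only.
schemas = {
    "student_a": ["student_id", "student_name"],
    "student_b": ["pupil_code", "pupil_title"],
    "shop": ["book_price", "book_title"],
}
print(cluster(schemas))
print(candidates(schemas["student_a"], schemas["student_b"]))
```

On this toy input the two student schemas end up in one cluster and `shop` in another, so attribute-level comparison is only needed within the student cluster. That restriction is where the efficiency gain described in the abstract would come from.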

 
