2-Stage Data Conflict Resolution Based on Markov Logic Networks  Cited by: 11

Authors: Zhang Yongxin (张永新)[1], Li Qingzhong (李庆忠)[1], Peng Zhaohui (彭朝晖)[1]

Affiliation: [1] School of Computer Science and Technology, Shandong University, Jinan 250014

Source: Chinese Journal of Computers (《计算机学报》), 2012, No. 1, pp. 101-111 (11 pages)

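Funding: National Key Technology R&D Program of China (2009BAH44B02); National Natural Science Foundation of China (90818001; 61003051); Shandong Province Science and Technology Research Program (2010GGX10108)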

Abstract: In data integration, resolving data conflicts accurately is key to the quality of the integrated data. Existing methods mainly resolve conflicts one attribute at a time. Because they neither distinguish the conflict degree of different attributes nor account for how resolving conflicts in one attribute affects another, their accuracy is limited. To address these shortcomings, the paper proposes a two-stage data conflict resolution method based on Markov logic networks. The method partitions attributes by conflict degree and processes them in two stages: (1) in the first stage, weak-conflict attributes are resolved with simple rules such as voting and mutual corroboration among facts; (2) in the second stage, strong-conflict attributes are resolved using the first-stage results together with additional rules that capture the mutual influence between data sources and facts, the dependencies among data sources, and the influence of weak-conflict attributes on strong-conflict attributes. Experiments on a large amount of real-world data show that the method resolves conflicts in integrated data effectively and with high accuracy.
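The partition-then-vote idea behind the first stage can be illustrated with a minimal Python sketch. It is not the paper's Markov logic network model: the abstract does not define the conflict-degree measure, so `conflict_degree`, the `threshold` value, the function names, and the sample data below are all assumptions made for illustration, and the second stage (weighted rules over sources and facts) is not shown.

```python
from collections import Counter


def conflict_degree(values):
    """Assumed measure: share of claims that disagree with the most common value."""
    counts = Counter(values)
    top_count = counts.most_common(1)[0][1]
    return 1 - top_count / len(values)


def partition_attributes(claims, threshold=0.4):
    """Split attributes into weak-conflict and strong-conflict lists.

    claims maps attribute -> {source: claimed value}.
    threshold is a hypothetical cut-off, not a value from the paper.
    """
    weak, strong = [], []
    for attr, by_source in claims.items():
        degree = conflict_degree(list(by_source.values()))
        (weak if degree <= threshold else strong).append(attr)
    return weak, strong


def resolve_by_voting(claims, attrs):
    """Stage-1 style resolution: pick the most frequently claimed value."""
    return {a: Counter(claims[a].values()).most_common(1)[0][0] for a in attrs}


# Hypothetical claims from four sources about one book record.
claims = {
    "publisher": {"s1": "MIT Press", "s2": "MIT Press", "s3": "MIT Press", "s4": "MIT"},
    "authors": {"s1": "A; B", "s2": "A", "s3": "B; A", "s4": "A, B"},
}

weak, strong = partition_attributes(claims)
stage1 = resolve_by_voting(claims, weak)
```

With this sample data, `publisher` falls in the weak-conflict group and is settled by voting, while `authors` is left for a second stage that would draw on the stage-1 result.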

Keywords: data conflict resolution; Markov logic networks; data integration; conflict degree; inference rules

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
