Source: Journal of Yanshan University (《燕山大学学报》), 2014, No. 6, pp. 523-531, 543 (10 pages)
Funding: National Natural Science Foundation of China (61202022)
Abstract: When cleaning algorithms cannot effectively repair inconsistent data, reviews in which "informed" users state the correct values are of great value to the other users of the database. Such reviews help them identify erroneous data and extract as much useful information as possible from an inconsistent database without losing information. Only correct, credible reviews have this value, however, so estimating the credibility of a review is a key problem in this kind of application. Unlike Internet reviews, a database is generally open only to users within the system, so user features are easier to extract and their semantics are well defined. Because data describes the real world, users who submit similar reviews of the same object tend to share the same background or semantic features. This paper proposes an algorithm that computes review credibility from an analysis of user features. The algorithm first mines user communities among historical reviewers according to their semantic features, obtaining the common features of the users who reviewed a given object at a given level of accuracy; these common features form a user template. Then, for any new review, it computes the review's credibility from the degree to which the reviewer matches the common-feature template, combined with key factors such as the reviewer's own credibility and the semantic relatedness between the reviewer and the reviewed object. Experiments show that the algorithm is effective in terms of both running time and accuracy.
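The two steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the keywords indicate that the authors use frequent subgraph mining and clustering, whereas this sketch substitutes a simple frequency threshold over flat feature sets. The names `Reviewer`, `build_template`, `match_degree`, and `review_credibility`, the `min_support` parameter, and the weights are all hypothetical choices made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Reviewer:
    name: str
    features: frozenset   # semantic features, e.g. {"dept:sales", "role:manager"}
    credibility: float    # prior credibility of this reviewer, assumed in [0, 1]


def build_template(reviewers, min_support=0.6):
    """Step 1 (simplified): keep the features shared by at least
    `min_support` of the historical reviewers of one object."""
    if not reviewers:
        return frozenset()
    counts = Counter(f for r in reviewers for f in r.features)
    threshold = min_support * len(reviewers)
    return frozenset(f for f, c in counts.items() if c >= threshold)


def match_degree(features, template):
    """Fraction of the template's features that the reviewer also has."""
    if not template:
        return 0.0
    return len(features & template) / len(template)


def review_credibility(reviewer, template, relatedness, weights=(0.5, 0.3, 0.2)):
    """Step 2 (simplified): weighted sum of template match, reviewer
    credibility, and reviewer-object semantic relatedness.
    The weights are assumptions, not values from the paper."""
    w_match, w_cred, w_rel = weights
    return (w_match * match_degree(reviewer.features, template)
            + w_cred * reviewer.credibility
            + w_rel * relatedness)


# Hypothetical historical reviewers of one object.
history = [
    Reviewer("u1", frozenset({"a", "b"}), 0.9),
    Reviewer("u2", frozenset({"a", "b", "c"}), 0.7),
    Reviewer("u3", frozenset({"a", "d"}), 0.8),
]
template = build_template(history)
newcomer = Reviewer("u4", frozenset({"a", "x"}), 0.8)
score = review_credibility(newcomer, template, relatedness=0.5)
```

A linear combination is only one way to merge the three factors; the abstract says they are combined but does not say how.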
Keywords: relational database; frequent subgraph mining; clustering; common features; credibility computation
CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]