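Author: Guo Wenlong[1]
Affiliation: [1] School of Electronic Information Science, Fujian Jiangxia University, Fuzhou, Fujian 350108
Source: Journal of Hengshui University (《衡水学院学报》), 2014, No. 1, pp. 15-17 (3 pages)
Funding: Class A Science and Technology Project of the Fujian Provincial Department of Education (JA12335)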
Abstract: A customer relationship database holds a large number of customer records, many of which are approximate duplicates of one another. Detecting, cleaning, and then merging these approximately duplicated records improves storage utilization and speeds up record queries. Based on a study of customer records, the paper proposes an algorithm for cleaning approximately duplicated records in a customer relationship database. The algorithm first sorts the records and sets the attribute weights and a record-similarity threshold. It then computes the similarity between adjacent records to decide whether they are approximate duplicates. Finally, the approximately duplicated records that were detected are cleaned and merged.
Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
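
The steps summarized in the abstract (sort, weight attributes, compare adjacent records against a threshold, merge) can be illustrated with a minimal sketch. This is not the paper's implementation: the attribute names, the weights, the threshold value, the sort key, the merge rule, and the use of `difflib.SequenceMatcher` as the field-level similarity measure are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from difflib import SequenceMatcher

# Hypothetical attribute weights (summing to 1) and similarity threshold.
# The abstract gives no concrete values; these are illustrative only.
WEIGHTS = {"name": 0.5, "phone": 0.3, "address": 0.2}
THRESHOLD = 0.85


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]. SequenceMatcher is a stand-in measure,
    not necessarily the one used in the paper."""
    if not a and not b:
        return 1.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_similarity(r1: dict, r2: dict) -> float:
    """Weighted sum of per-attribute similarities."""
    return sum(
        w * field_similarity(r1.get(k, ""), r2.get(k, ""))
        for k, w in WEIGHTS.items()
    )


def merge(r1: dict, r2: dict) -> dict:
    """Assumed merge rule: keep the longer value of each attribute."""
    return {k: max(r1.get(k, ""), r2.get(k, ""), key=len) for k in WEIGHTS}


def clean(records: list) -> list:
    """Sort the records, then fold each record into its predecessor
    when their weighted similarity reaches the threshold."""
    ordered = sorted(
        records,
        key=lambda r: (r.get("name", "").lower(), r.get("phone", "")),
    )
    cleaned = []
    for rec in ordered:
        if cleaned and record_similarity(cleaned[-1], rec) >= THRESHOLD:
            cleaned[-1] = merge(cleaned[-1], rec)
        else:
            cleaned.append(dict(rec))
    return cleaned


if __name__ == "__main__":
    customers = [
        {"name": "Zhang Wei", "phone": "13800000001", "address": "12 Wusi Road, Fuzhou"},
        {"name": "Li Na", "phone": "13900000002", "address": "8 Hualin Road, Fuzhou"},
        {"name": "Zhang Wei", "phone": "13800000001", "address": "12 Wusi Rd, Fuzhou"},
    ]
    for row in clean(customers):
        print(row)
```

Because only neighbouring records are compared after sorting, the choice of sort key decides which duplicates can be found: records that differ in the leading characters of the key may never become adjacent. In this made-up sample, the two "Zhang Wei" entries sort next to each other and are merged into one record, leaving two records in total.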