Source: Computer Technology and Development (《计算机技术与发展》), 2015, No. 12, pp. 143-146, 151 (5 pages)
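Funding: National Natural Science Foundation of China (61373135); Major Natural Science Research Project of Jiangsu Higher Education Institutions (12KJA52003); Nanjing University of Posts and Telecommunications Student Science and Technology Innovation Training Program (STITP) (201410293023Z)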
Abstract: Cluster analysis is an important topic in data mining, and preprocessing duplicate data and isolated data (outliers) can improve clustering results. For duplicate data, the paper extends the traditional duplicate-detection algorithm SNM with an elastic window and a variable sliding speed, improving both the accuracy and the efficiency of the search. For isolated data, it proposes a cluster-wise search algorithm based on hierarchical clustering: hierarchical clustering first divides the data into independent clusters, and isolated points are then searched for cluster by cluster, which speeds up the search; a recovery check then restores non-isolated points that were deleted by mistake, which improves accuracy. In the simulation experiments, a subset of the data is first sampled to verify the accuracy of the improved preprocessing algorithms. The algorithms are then applied to mobile users' consumption data, and the cleaned data are clustered in order to identify customers' home-location information. The experimental results show that the proposed preprocessing algorithms achieve high accuracy and efficiency.
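For readers unfamiliar with SNM (the Sorted-Neighborhood Method), the sketch below shows only the classic fixed-window baseline that the abstract says the paper builds on. It is not the paper's improved algorithm: the elastic window and variable sliding speed are not specified in the abstract and are not reproduced here. The function name `snm_duplicates`, the `window` and `threshold` parameters, the use of `difflib.SequenceMatcher` as the similarity measure, and the sample records are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two strings (illustrative choice of measure)."""
    return SequenceMatcher(None, a, b).ratio()


def snm_duplicates(records, key=lambda r: r, window=3, threshold=0.85):
    """Classic fixed-window Sorted-Neighborhood Method (baseline sketch).

    1. Sort the records by a sorting key.
    2. Slide a window of `window` records over the sorted order.
    3. Compare each record only with the following `window - 1` records.

    Returns a sorted list of (i, j) index pairs, i < j, into the original
    `records` list, for pairs whose keys are at least `threshold` similar.
    """
    # Positions of the records in key order; original indices are preserved.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        # Only neighbors inside the window are compared, not all pairs.
        for j in order[pos + 1 : pos + window]:
            if similarity(key(records[i]), key(records[j])) >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


# Hypothetical sample data, not taken from the paper.
records = ["john smith", "jon smith", "mary jones", "alice brown", "mary jonse"]
print(snm_duplicates(records, window=2))
```

A fixed window trades recall for speed: a small `window` can miss duplicates that sort far apart, while a large one wastes comparisons on dissimilar neighbors. That trade-off is the usual motivation for resizing the window during the scan, which is the direction the abstract describes.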
Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]