Authors: 赵亚琴 (Zhao Yaqin)[1], 周献中 (Zhou Xianzhong)[2], 何新 (He Xin)[1], 王建宇 (Wang Jianyu)[1]
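Affiliations: [1] School of Automation, Nanjing University of Science and Technology, Nanjing 210094; [2] School of Management and Engineering, Nanjing University, Nanjing 210093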
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2006, No. 3, pp. 289-294 (6 pages)
Funding: Natural Science Foundation of Jiangsu Province (No. BK2004137)
Abstract: Clustering analysis is one of the most common techniques in data mining. The scale, dimensionality, and sparseness of a dataset each constrain clustering in a different way. This paper proposes an effective clustering method for sparse data with a high attribute dimension. Definitions are given for sparse similarity, the similarity between equivalence relations, and the generalized equivalence relation. Initial equivalence classes are formed from the sparse similarity between objects and the theory of equivalence relations. The initial equivalence relations are then revised according to the similarity between equivalence relations, which makes the final clustering results more reasonable. The clustering process does not depend on the order in which the input samples are presented. High-dimensional sparse data is compressed into sparse feature vectors whose dimension is far lower than that of the original data, which improves the efficiency of the algorithm when the dimension is high. The method is therefore suited to clustering high-dimensional sparse data.
Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
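The abstract names the steps of the method (a similarity between sparse objects, initial equivalence classes, order independence) but this record does not reproduce the paper's definitions. The sketch below is therefore only an illustration under stated assumptions, not the authors' algorithm: Jaccard similarity stands in for the paper's sparse similarity, the threshold of 0.5 is arbitrary, and the function names `jaccard` and `equivalence_classes` are hypothetical. Each object is stored as the set of its nonzero attribute indices, a simple compressed form of a sparse binary vector. Thresholded similarity is closed transitively with a union-find structure so that the result is a genuine equivalence relation. The revision step based on similarity between equivalence relations is not shown.

```python
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Stand-in similarity for sparse binary objects (assumption, not the paper's definition)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def equivalence_classes(objects: dict, threshold: float) -> list:
    """Partition objects by the transitive closure of 'similarity >= threshold'."""
    parent = {name: name for name in objects}

    def find(x):
        # Follow parent links to the class representative, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the classes of every sufficiently similar pair.
    for u, v in combinations(objects, 2):
        if jaccard(objects[u], objects[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in objects:
        groups.setdefault(find(name), set()).add(name)
    # Sort members and classes so the output does not depend on input order.
    return sorted(sorted(group) for group in groups.values())


# Each object keeps only the indices of its nonzero attributes.
data = {
    "x1": frozenset({3, 17, 256}),
    "x2": frozenset({3, 17, 900}),
    "x3": frozenset({17, 900, 4000}),
    "x4": frozenset({5000, 5001}),
}
classes = equivalence_classes(data, threshold=0.5)
print(classes)
```

Here `x1` and `x3` are not directly similar enough, but both are similar to `x2`, so the transitive closure places all three in one class. Because the classes are the connected components of the thresholded similarity graph, presenting the objects in a different order yields the same partition.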