Hierarchical Clustering Analysis of Interval Data Based on Hausdorff Distance  Cited by: 8

Authors: GUO Junpeng (郭均鹏)[1], TAN Zhihui (谭智慧)[1], DENG Deng (邓登)[1]

Affiliation: [1] College of Management and Economics, Tianjin University, Tianjin 300072, China

Source: Journal of Applied Statistics and Management (《数理统计与管理》), 2014, No. 4, pp. 634-641 (8 pages)

Funding: National Natural Science Foundation of China, Young Scientists Fund (Grants 70701026; 71271147)

Abstract: Because the Hausdorff distance measures the distance between two compact sets, an interval number can be treated as a compact set, which yields a distance between interval numbers. The distance between interval vectors is then studied, giving the distance between two samples in cluster analysis, and the Hausdorff distance between two clusters is defined. To remove the influence of measurement scale on the clustering result, the standardization of interval data is studied. On this basis, a hierarchical clustering algorithm for interval data is presented. A random simulation study evaluates the validity of the method; the results show that the Hausdorff distance method outperforms the traditional Euclidean distance method under all of the designed experimental conditions. Finally, interval data are constructed following the ideas of symbolic data analysis, and an example is given that clusters several animal groups by physiological characteristics such as height and weight.
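The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The only formula taken as given is the standard Hausdorff distance between two closed intervals, max(|a1 - a2|, |b1 - b2|). Everything else is an assumption made for illustration, since the abstract does not state the paper's exact definitions: the Euclidean-style combination in `sample_distance`, the set-based Hausdorff distance in `cluster_hausdorff`, and the merge loop in `agglomerate` are hypothetical names and choices, and standardization is omitted.

```python
from itertools import combinations


def interval_hausdorff(x, y):
    """Hausdorff distance between closed intervals x = (lo, hi) and y = (lo, hi)."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def sample_distance(u, v):
    """Distance between two interval vectors (one interval per variable).

    Assumption: per-variable Hausdorff distances are combined Euclidean-style;
    the paper may combine them differently.
    """
    return sum(interval_hausdorff(a, b) ** 2 for a, b in zip(u, v)) ** 0.5


def cluster_hausdorff(A, B):
    """Hausdorff distance between two clusters, each a list of interval vectors.

    Assumption: the usual set-based definition, applied on top of sample_distance.
    """
    a_to_b = max(min(sample_distance(a, b) for b in B) for a in A)
    b_to_a = max(min(sample_distance(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)


def agglomerate(samples, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns clusters as lists of sample indices.
    """
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters (i < j) with the smallest Hausdorff distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_hausdorff(
                [samples[k] for k in clusters[p[0]]],
                [samples[k] for k in clusters[p[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Each sample here is a list of `(lower, upper)` pairs, one per variable, so calling `agglomerate(samples, 2)` on four such samples returns two lists of sample indices. In practice the variables would first be standardized, as the abstract notes, so that no single scale dominates the distance.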

Keywords: interval number; cluster analysis; Hausdorff distance

Classification number: O212.4 [Science: Probability Theory and Mathematical Statistics]

 
