Research on Data Stream Clustering Algorithm Based on Interactive Basis Function


Authors: HUANG Cheng-ning, LI Li [1], JIANG Li-li, XU Ping-ping [2]

Affiliations: [1] Pujiang College of Nanjing University of Technology, Nanjing, Jiangsu 211222, China; [2] School of Information Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China

Source: Computer Technology and Development (《计算机技术与发展》), 2024, No. 3, pp. 28-34 (7 pages)

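Funding: National Natural Science Foundation of China (61702229); Key Research Cultivation Project of Pujiang College of Nanjing University of Technology (njpj2022-1-07); Young Teacher Development Fund of Pujiang College of Nanjing University of Technology (PJYQ03).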

Abstract: Clustering is an effective tool for data mining, and data stream clustering has become an active research topic. Many data stream clustering algorithms have been proposed, but most use distance as the similarity measure, which makes them sensitive to noise and limits clustering quality. To make data stream clustering more flexible and improve its quality, this paper introduces fractional-order interactive basis functions (IBFs) into data stream clustering and combines them with the fuzzy ART algorithm to produce a flexible decision-boundary strategy, yielding a new data stream clustering algorithm, IBFs_ART. For each arriving data point, the algorithm first expands the features through a precomputed function based on the correlations between features and applies a fractional-order transformation to the original features; it then clusters the stream on the basis of the interactive basis functions. IBFs generate flexible decision boundaries and do not require any specific software; the precomputed function can be implemented in any algorithm and can be used with any extension of data stream clustering algorithms. Experiments show that IBFs generate flexible decision boundaries at low computational cost to find the optimal clusters, achieve higher clustering quality and purity under the same vigilance parameter, and give higher clustering accuracy, a higher symmetric measure, and a lower error rate than traditional clustering algorithms.
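The pipeline described in the abstract (expand the features, apply a fractional-order transform, then cluster with fuzzy ART) can be illustrated with a minimal Python sketch. The abstract does not give the IBF formulas, so `expand_features` is a hypothetical stand-in that uses fractional powers and pairwise products, and the exponent `gamma` is an assumed parameter; neither is the authors' definition. The `FuzzyART` class follows the standard fuzzy ART formulation (complement coding, choice function, vigilance test with parameter `rho`), not the paper's IBFs_ART implementation. Inputs are assumed to be scaled to [0, 1].

```python
import numpy as np


def expand_features(x, gamma=0.5):
    """Hypothetical expansion (an assumption, not the paper's IBF definition)."""
    x = np.asarray(x, dtype=float)
    frac = x ** gamma                     # fractional-order transform of each feature
    i, j = np.triu_indices(len(x), k=1)   # index pairs with i < j
    inter = x[i] * x[j]                   # pairwise interaction terms
    return np.concatenate([frac, inter])  # stays in [0, 1] for inputs in [0, 1]


class FuzzyART:
    """Standard fuzzy ART, processing one point at a time."""

    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho = rho        # vigilance parameter
        self.alpha = alpha    # choice parameter
        self.beta = beta      # learning rate (1.0 = fast learning)
        self.weights = []     # one weight vector per category

    def partial_fit(self, x):
        """Assign x to a category, creating one if none passes vigilance."""
        I = np.concatenate([x, 1.0 - x])  # complement coding
        # Choice function T_j = |I ^ w_j| / (alpha + |w_j|)
        scores = [np.minimum(I, w).sum() / (self.alpha + w.sum())
                  for w in self.weights]
        # Try categories from highest to lowest choice value
        for j in np.argsort(scores)[::-1]:
            w = self.weights[j]
            match = np.minimum(I, w).sum() / I.sum()
            if match >= self.rho:         # vigilance test passed: learn
                self.weights[j] = (self.beta * np.minimum(I, w)
                                   + (1.0 - self.beta) * w)
                return int(j)
        self.weights.append(I.copy())     # no resonance: open a new category
        return len(self.weights) - 1


# Toy stream: two points near (0.1, 0.1), then two near (0.9, 0.9)
stream = np.array([[0.10, 0.10], [0.12, 0.09], [0.90, 0.85], [0.88, 0.90]])
model = FuzzyART(rho=0.75)
labels = [model.partial_fit(expand_features(p)) for p in stream]
print(labels)
```

On this toy stream the first two points share one category and the last two share another. Raising `rho` makes the vigilance test stricter and tends to produce more, tighter categories.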

Keywords: clustering; data stream; data stream clustering; interactive basis functions; fuzzy adaptive resonance theory

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
