Accurate Recognition Simulation of Data Similarity Based on Ordered Clustering Equation

Authors: ZHANG Yuan; ZHANG Hui-jun (School of Modern Manufacturing Engineering, Heilongjiang University of Technology, Jixi, Heilongjiang 158100, China; College of Modern Manufacturing Engineering, Yan'an University, Yan'an, Shaanxi 716000, China)

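Affiliations: [1] School of Modern Manufacturing Engineering, Heilongjiang University of Technology, Jixi, Heilongjiang 158100; [2] School of Mathematics and Computer Science, Yan'an University, Yan'an, Shaanxi 716000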

Source: Computer Simulation (《计算机仿真》), 2023, No. 4, pp. 402-406 (5 pages)

Funding: Natural Science Foundation of Heilongjiang Province (LH2022A023).

Abstract: Massive data in network environments is highly complex, comprising large volumes of structured, semi-structured, and unstructured data, and the lengths and positions of data blocks readily exhibit high similarity. Existing similarity recognition methods are task-intensive and occupy a large amount of memory. To further improve data utilization and reduce data redundancy, a modeling and simulation method for data similarity recognition based on the ordered clustering equation is proposed. Wavelet technology and data deduplication technology are used to denoise the network data, and optimized feature vectors of the network data are extracted by presetting the data set centers. On this basis, the similarity between feature vectors is analyzed along both the temporal and spatial dimensions, and a data similarity recognition model is constructed from a point cloud classification network and the ordered clustering equation. Experimental results show that, when the proposed method is used to recognize data similarity, the normalized mutual information value is 0.12, indicating that the method is highly accurate; for data sets of different sizes, the method completes recognition of all data similarity within 0.6 s. These experimental results demonstrate that the method offers high accuracy and efficiency in application.
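The abstract reports a normalized mutual information (NMI) value but does not say which normalization is used. As a minimal sketch only, the following computes NMI between two cluster labelings under the assumption of arithmetic-mean normalization, 2·I(A;B) / (H(A) + H(B)); the function name `normalized_mutual_information` and the sample labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """NMI of two labelings, assuming arithmetic-mean normalization.

    Assumption (not from the paper): NMI = 2 * I(A;B) / (H(A) + H(B)).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("label sequences must be non-empty")

    count_a = Counter(labels_a)                   # cluster sizes in A
    count_b = Counter(labels_b)                   # cluster sizes in B
    count_ab = Counter(zip(labels_a, labels_b))   # joint cluster sizes

    def entropy(counts):
        # Shannon entropy (natural log) of an empirical distribution.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_a = entropy(count_a)
    h_b = entropy(count_b)

    # Mutual information from the joint and marginal frequencies.
    mi = sum(
        (c / n) * math.log((c * n) / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )

    if h_a + h_b == 0:
        # Both labelings put everything in one cluster; treat as identical.
        return 1.0
    return 2 * mi / (h_a + h_b)


if __name__ == "__main__":
    # Hypothetical labels: the same partition under renamed cluster ids.
    print(normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))
```

Under this definition the score ranges from 0 (independent labelings) to 1 (identical partitions up to relabeling), and other normalizations (geometric mean, min, or max of the entropies) can yield different values for the same pair of labelings.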

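Keywords: wavelet technology; data deduplication technology; feature vector similarity; point cloud classification network; ordered clustering equation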

Classification: TP391 [Automation and Computer Technology - Computer Application Technology]

 
