Research on automatic mining algorithm of user information data based on SOM clustering  (Cited by: 3)


Authors: HUANG Hongtao (黄红涛) [1]; XU Ting (徐婷) [2] (Information Office of Central China Normal University, Wuhan 430079, China; Library of Central China Normal University, Wuhan 430079, China)

Affiliations: [1] Information Office, Central China Normal University, Wuhan 430079; [2] Library, Central China Normal University, Wuhan 430079

Source: Automation & Instrumentation (《自动化与仪器仪表》), 2022, No. 7, pp. 26-30 (5 pages)

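Funding: Provincial-level key project of the Provincial-Ministerial Co-constructed Collaborative Innovation Center for Informatization and Balanced Development of Basic Education (ZDKT20210017).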

Abstract: In the big data era, the volume of user information data in every industry has grown sharply and reached massive scale. Mining useful user information from such data has become one of the obstacles restricting industry development, and related research has attracted wide attention. This paper proposes an automatic user information data mining algorithm based on SOM clustering. User information data are obtained with crawler technology and converted to a unified format. A fuzzy algorithm is then applied to match user information data and remove redundant records. An SOM neural network topology is built, the number of cluster categories is determined, and the user information data are processed with the SOM clustering algorithm. Based on these clustering results, the ATPRK method is used to infer the scale of data demand, and the data are clustered again, which realizes automatic mining of user information data. Experimental data show that the proposed algorithm takes less time to mine user information data automatically and yields a smaller clustering entropy, indicating that it performs better in application.
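The SOM clustering step and the clustering entropy measure named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no grid size, learning schedule, or entropy formula, so the exponential decay, the Gaussian neighbourhood, the size-weighted Shannon entropy, and every function name below (`train_som`, `best_matching_unit`, `cluster_entropy`) are assumptions made for illustration. The crawling, fuzzy matching, and ATPRK steps are not shown.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights,
               key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols self-organizing map on equal-length vectors.

    Hypothetical helper: the schedule and defaults are illustrative only.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised uniformly in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both decay exponentially.
        decay = math.exp(-epoch / epochs)
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 1e-3)
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighbourhood measured on the grid, not in data space.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def cluster_entropy(assignments, labels):
    """Size-weighted mean Shannon entropy (in bits) of labels per cluster.

    Assumed definition: lower values mean purer clusters.
    """
    clusters = {}
    for a, y in zip(assignments, labels):
        clusters.setdefault(a, []).append(y)
    total = len(labels)
    entropy = 0.0
    for members in clusters.values():
        counts = {}
        for y in members:
            counts[y] = counts.get(y, 0) + 1
        h = -sum((c / len(members)) * math.log2(c / len(members))
                 for c in counts.values())
        entropy += (len(members) / total) * h
    return entropy


if __name__ == "__main__":
    # Synthetic two-group data standing in for cleaned user records.
    rng = random.Random(1)
    group_a = [[rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2)] for _ in range(20)]
    group_b = [[rng.uniform(0.8, 1.0), rng.uniform(0.8, 1.0)] for _ in range(20)]
    data = group_a + group_b
    labels = ["a"] * 20 + ["b"] * 20
    som = train_som(data, rows=1, cols=2)
    assignments = [best_matching_unit(som, x) for x in data]
    print("clustering entropy:", cluster_entropy(assignments, labels))
```

In this sketch each record is assigned to the cluster of its best matching unit, so the number of grid nodes plays the role of the cluster count that the abstract says is determined before clustering. The entropy function needs reference labels, which is one common way to score cluster purity; the paper may define its measure differently.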

Keywords: SOM clustering; users; data mining; information data; parallel computing; big data

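Classification: TP39 [Automation and Computer Technology: Computer Application Technology]; R195.1 [Automation and Computer Technology: Computer Science and Technology]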

 
