Author: WANG Chenguang (王晨光)
Source: Technology Innovation and Application (《科技创新与应用》), 2022, Issue 18, pp. 18-21 (4 pages)
Abstract: In recent years, user portraits have been widely applied as an effective big data tool in Internet industries such as e-commerce and social networking. Traditional enterprises, however, often have few data dimensions, and their data are scattered across multiple information systems, so general-purpose methods struggle to produce accurate results. To address this problem, the article proposes a user portrait method based on an optimized K-means clustering algorithm: it uses the K-means++ initial cluster center optimization to improve clustering accuracy and the Mini Batch K-means small-batch optimization to improve convergence speed, combining the two strongly complementary techniques to strengthen the algorithm's analysis and processing capability. Experiments on enterprise data and public datasets show that, compared with classic K-means, the method improves speed by a factor of roughly 150 and accuracy by about 20%.
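Keywords: optimized K-means algorithm; user portrait; cluster analysis; limited dimensions; high dispersion
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]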
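
The abstract names two building blocks, K-means++ seeding and Mini Batch K-means updates, but the record does not show how the article combines them. The following is only a minimal sketch of one common way to pair the two, not the author's implementation. The function names (`kmeans_pp_init`, `mini_batch_kmeans`), the parameters, and the per-center running-mean update rule are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(points))]
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(list(points[idx]))
        # Keep, for every point, the distance to its nearest center so far.
        d2 = [min(old, sq_dist(p, centers[-1])) for old, p in zip(d2, points)]
    return centers


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))


def mini_batch_kmeans(points, k, batch_size=32, n_iter=100, seed=0):
    """Mini-batch K-means with K-means++ seeding (illustrative sketch).

    Each iteration samples a small batch, assigns the batch points to their
    nearest centers, then moves each center toward its assigned points with
    a per-center step size of 1 / (points seen so far by that center).
    """
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    counts = [0] * k
    for _ in range(n_iter):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, then update the centers.
        assigned = [nearest(p, centers) for p in batch]
        for p, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centers[j] = [(1 - eta) * c + eta * x for c, x in zip(centers[j], p)]
    return centers
```

Calling `mini_batch_kmeans(data, k=3)` on a list of numeric tuples returns three centers as lists of floats. Because each iteration touches only `batch_size` points rather than the full dataset, the cost per iteration stays constant as the data grows, which is the source of the speed-up the abstract attributes to the mini-batch variant; the seeding step is what targets accuracy.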