Method of Mining Website User Access Patterns Based on Frequent Preference


Authors: DU Wei-hua; WENG Chuan-fang (School of Economics and Management, Science and Technology College of NCHU, Jiujiang, Jiangxi 332020, China)

Affiliation: [1] Science and Technology College of Nanchang Hangkong University, Jiujiang, Jiangxi 332020, China

Source: Computer Simulation (《计算机仿真》), 2022, No. 10, pp. 425-429 (5 pages)

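Funding: Science and Technology Research Project of the Jiangxi Provincial Department of Education, "Research on Precision Marketing for E-commerce Enterprises Based on Data Mining Technology" (GJJ218711).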

Abstract: Existing methods for mining website user access patterns do not filter website information first, which leads to low mining efficiency, low accuracy, and low coverage. This paper proposes a method for mining website user access patterns based on frequent preference. Website information is preprocessed with a segmentation-tag method, website documents are represented structurally with a vector space model, and the similarity between documents in the website is calculated; the website information is then filtered according to the result. Next, the Hamming distance matrix between the row vectors of the access matrix is computed, and its element values are compared with a preset similarity threshold. Based on the comparison, the candidate 2-item sets of interest sub-paths are constructed, sub-paths with low frequent preference are removed from the sub-path set, and the remaining sub-paths are merged to obtain the users' preferred browsing paths, which completes the mining of website user access patterns. Simulation results show that the proposed method achieves high mining efficiency, accuracy, and coverage.
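The Hamming distance step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how the access matrix is laid out or how the threshold is applied, so the sketch assumes binary row vectors and assumes that a pair of rows counts as similar when its distance is at or below the threshold. The names `hamming_distance_matrix` and `similar_pairs` are hypothetical.

```python
from itertools import combinations


def hamming_distance_matrix(access):
    """Return the symmetric matrix of pairwise Hamming distances between rows.

    `access` is assumed to be a list of equal-length binary row vectors
    (an assumption; the abstract does not specify the matrix layout).
    """
    n = len(access)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Hamming distance: number of positions where the two rows differ.
        d = sum(a != b for a, b in zip(access[i], access[j]))
        dist[i][j] = dist[j][i] = d
    return dist


def similar_pairs(dist, threshold):
    """Return index pairs (i, j), i < j, whose distance is <= threshold.

    Treating "distance at or below the threshold" as "similar" is an
    assumption made for illustration only.
    """
    n = len(dist)
    return [(i, j) for i, j in combinations(range(n), 2)
            if dist[i][j] <= threshold]


# Toy access matrix: three rows, four columns (hypothetical data).
access = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]
dist = hamming_distance_matrix(access)
pairs = similar_pairs(dist, threshold=1)
print(dist)   # [[0, 1, 4], [1, 0, 3], [4, 3, 0]]
print(pairs)  # [(0, 1)]
```

In this toy example only rows 0 and 1 fall within the threshold. How such pairs become candidate 2-item sets, and how frequent preference is computed for pruning, is defined in the full paper and is not reproduced here.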

Keywords: frequent preference; information filtering; user access pattern; distance matrix

Classification: TP312 [Automation and Computer Technology: Computer Software and Theory]

 
