Mining based on page visiting sequence of path clustering (基于路径聚类的页面访问次序的挖掘)  Cited by: 2

Authors: Zhang Chunna (张春娜)[1], Li Yiran (李轶然)[1]

Affiliation: [1] School of Software, University of Science and Technology Liaoning, Anshan, Liaoning 114051, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2013, No. 1, pp. 303-306, 313 (5 pages)

Abstract: To discover user behavior patterns and thereby optimize the structure of a Web site, the K-PathSearch algorithm, based on user access paths, is proposed. After the web pages are preprocessed, a user access transaction model is built with the help of page link parameters, yielding a usable data set. User interest degree is then analyzed on extracted samples; its main influencing factors are access order, access count, and dwell time. Using a redefined similarity measure, users with similar interests are grouped into one class. On this basis, the longest fitting path of user access is defined and the path cluster centers are computed. The computed results show a significant relative increase in both the number of clusters and the average length of the cluster centers, which indicates that the model and algorithm are feasible and effective.
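
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's K-PathSearch algorithm, whose formulas are not given in this record. Everything here is an assumption made for illustration: the interest score (dwell time weighted by visit position and accumulated over repeat visits), cosine similarity as a stand-in for the redefined similarity, a greedy threshold grouping, and a longest common subsequence as a stand-in for the "longest fitting path". All function names and the threshold value are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def interest_vectors(sessions):
    """Build one interest vector per user.

    sessions: {user: [(page, dwell_seconds), ...]} in visit order.
    Assumed scoring (not from the paper): earlier pages get a larger
    order weight, and repeat visits accumulate, so visit count matters.
    """
    vectors = {}
    for user, visits in sessions.items():
        scores = defaultdict(float)
        total = len(visits)
        for position, (page, dwell) in enumerate(visits):
            order_weight = (total - position) / total
            scores[page] += order_weight * dwell
        vectors[user] = dict(scores)
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[p] * b[p] for p in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_users(vectors, threshold=0.5):
    """Greedy grouping: join the first cluster whose first member is similar enough."""
    clusters = []
    for user in sorted(vectors):
        for cluster in clusters:
            if cosine(vectors[user], vectors[cluster[0]]) >= threshold:
                cluster.append(user)
                break
        else:
            clusters.append([user])
    return clusters


def longest_common_subsequence(path_a, path_b):
    """Longest page sequence appearing in order in both paths (dynamic programming)."""
    rows, cols = len(path_a) + 1, len(path_b) + 1
    table = [[[] for _ in range(cols)] for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if path_a[i - 1] == path_b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + [path_a[i - 1]]
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1], key=len)
    return table[-1][-1]


# Hypothetical sample data, invented for this sketch.
sessions = {
    "u1": [("A", 30), ("B", 20)],
    "u2": [("A", 25), ("B", 15)],
    "u3": [("X", 40), ("Y", 10)],
}
clusters = group_users(interest_vectors(sessions))
shared_path = longest_common_subsequence(["A", "B", "C", "D"], ["A", "C", "D"])
```

In this sketch a cluster's representative path could be taken as the common subsequence shared by its members' paths; how the paper actually computes the path cluster center is not stated in the abstract.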

Keywords: clustering; path clustering; user access transaction; K-PathSearch algorithm; cluster center

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
