A directed tree based approach for mining maximum frequent access patterns in Web logs

Cited by: 9


Authors: Zhan Yubin (詹宇斌)[1], Yin Jianping (殷建平)[1], Zhang Ling (张玲)[1], Long Jun (龙军)[1], Cheng Jieren (程杰仁)[1]

Affiliation: [1] School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, China

Source: Journal of Computer Applications (《计算机应用》), 2006, No. 7, pp. 1662-1665 (4 pages)

Funding: National Natural Science Foundation of China (60373023)

Abstract: This paper proposes s-Tree, an Apriori-based algorithm for mining maximum frequent access patterns in Web logs. First, the algorithm represents each user session as a directed tree, which makes it possible to mine both maximal forward reference transactions and users' preferred browsing paths. Second, it counts support with a content-page-first method, which uncovers specific user access patterns that traditional algorithms cannot find. Third, it joins frequent pattern trees with frequent arcs organized by level, which avoids the join-condition check that graph-structured data mining algorithms need when they join two frequent pattern trees directly, and it applies a pre-pruning strategy to reduce the algorithm's overhead. Experimental results show that s-Tree is scalable and more efficient than directly applying graph-structured data mining algorithms such as AGM (Apriori-based Graph Mining) and FSG (Frequent Subgraph Discovery).
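The abstract does not spell out how a session becomes a directed tree, so the following Python sketch is only an illustration under stated assumptions, not the paper's s-Tree construction. It assumes a session is an ordered list of page identifiers and that revisiting a page on the current path is a backward move (the classic maximal-forward-reference convention). The function names session_to_tree and maximal_forward_references are hypothetical.

```python
def session_to_tree(session):
    """Build a directed tree (nested dicts: page -> children) from a click sequence.

    Assumption (not from the paper): a page that already lies on the current
    root-to-node path is treated as a backward move, so no new arc is added.
    """
    root = session[0]
    tree = {root: {}}
    path = [root]              # pages from the root to the current page
    stack = [tree[root]]       # stack[i] is the children dict of path[i]
    for page in session[1:]:
        if page in path:
            # Backward reference: return to the earlier node on the path.
            i = path.index(page)
            del path[i + 1:]
            del stack[i + 1:]
        else:
            # Forward reference: add (or reuse) an arc to a child node.
            child = stack[-1].setdefault(page, {})
            path.append(page)
            stack.append(child)
    return tree


def maximal_forward_references(tree, prefix=()):
    """Return every root-to-leaf path of the tree as a tuple of pages."""
    paths = []
    for page, children in tree.items():
        here = prefix + (page,)
        if children:
            paths.extend(maximal_forward_references(children, here))
        else:
            paths.append(here)
    return paths


if __name__ == "__main__":
    # Hypothetical session: forward clicks interleaved with "back" moves.
    clicks = ["A", "B", "C", "D", "C", "B", "E", "G", "H", "G", "W",
              "A", "O", "U", "O", "V"]
    for mfr in maximal_forward_references(session_to_tree(clicks)):
        print(" -> ".join(mfr))
```

Under these assumptions, each root-to-leaf path of the tree is one maximal forward reference, and the tree keeps the branching structure that a flat list of paths would lose. Support counting and the Apriori-style join over frequent arcs are not shown, since the abstract gives no detail on them.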

Keywords: Web usage mining; maximum frequent access pattern; directed tree; Web log

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
