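Authors: Zhan Yubin (詹宇斌)[1], Yin Jianping (殷建平)[1], Zhang Ling (张玲)[1], Long Jun (龙军)[1], Cheng Jieren (程杰仁)[1]
Affiliation: [1] School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073
Source: Journal of Computer Applications (《计算机应用》), 2006, No. 7, pp. 1662-1665 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60373023)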
Abstract: A novel Apriori-based algorithm named s-Tree is proposed for mining maximum frequent access patterns in Web logs. The main contributions of the algorithm are as follows. First, a directed tree is used to represent each user session, which makes it possible to mine both the maximum forward reference transactions and the users' preferred access paths. Second, support is counted with a method that gives priority to content pages, which uncovers specific user access patterns that traditional algorithms cannot find. Third, frequent pattern trees are extended by joining them with layered frequent arcs, which avoids the join-condition check needed when graph-structure mining algorithms join two frequent pattern trees directly, and a pre-pruning strategy further reduces the overhead. Experimental results show that the s-Tree algorithm is scalable and more efficient than directly applying graph-structure pattern mining algorithms such as AGM (Apriori-based Graph Mining) and FSG (Frequent Subgraph Discovery).
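Keywords: Web usage mining; maximum frequent access pattern; directed tree; Web log
Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]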
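The abstract refers to "maximum forward reference transactions" without spelling out how they are obtained from a session. The sketch below illustrates the commonly used idea only: follow the click sequence, and each time the user returns to a page already on the current path, emit the path built so far and truncate back to that page. It is not the s-Tree algorithm itself (which, per the abstract, represents a session as a directed tree); the function name `maximal_forward_references` and the list-based session representation are assumptions made for illustration.

```python
def maximal_forward_references(session):
    """Split one click sequence into maximal forward reference paths.

    `session` is a list of page identifiers in visiting order
    (a hypothetical, simplified stand-in for a parsed Web log session).
    """
    path = []          # current forward path, starting at the first page
    extending = False  # True while the most recent move was a forward move
    results = []

    for page in session:
        if page in path:
            # Backward move: the path is maximal only if we were moving forward.
            if extending:
                results.append(list(path))
            # Truncate the path back to the revisited page.
            path = path[: path.index(page) + 1]
            extending = False
        else:
            # Forward move: extend the current path.
            path.append(page)
            extending = True

    # A session that ends on a forward move leaves one more maximal path.
    if extending and path:
        results.append(list(path))
    return results


if __name__ == "__main__":
    clicks = list("ABCDCBEGHGWAOUOV")
    for ref in maximal_forward_references(clicks):
        print("".join(ref))
```

For the click sequence A B C D C B E G H G W A O U O V, this yields the paths ABCD, ABEGH, ABEGW, AOU and AOV. Frequent access patterns would then be mined over such paths across many sessions; that step, and the content-page-first support counting described in the abstract, are not shown here.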