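Affiliation: [1] School of Computer Science and Technology, Tianjin University
Source: Journal of Liaoning Technical University (Natural Science) (《辽宁工程技术大学学报(自然科学版)》), 2007, No. 6, pp. 892-894 (3 pages)
Funding: Supported by the Tianjin Science and Technology Development Program (07JCZDJC067007)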
Abstract: To make web page information easier for users to browse, a hierarchical web page clustering method based on all-confidence association analysis is proposed. The method represents web documents with the Vector Space Model (VSM), treating each document as a transaction and the words in it as the items of that transaction. An association mining algorithm discovers strong association rules, from which the base clusters are generated; a graph partitioning algorithm then completes the hierarchical clustering of the web documents. During rule generation, the all-confidence measure is used to discover strong affinity patterns. Rule generation is therefore unaffected by the choice of support threshold: strong affinity patterns are found even when the threshold is set to zero, and weakly correlated cross-support patterns are effectively eliminated.
Classification: TP311 [Automation and Computer Technology - Computer Software and Theory]
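The all-confidence measure named in the abstract can be illustrated with a small sketch. It uses the standard definition, the support of an itemset divided by the largest support of any single item in it, and is not taken from the paper itself. The function names and the toy documents are assumptions made for illustration only.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def all_confidence(itemset, transactions):
    """Standard all-confidence: supp(X) / max over items i in X of supp({i}).

    Computed on raw counts, since the common 1/len(transactions)
    factor cancels in the ratio.
    """
    max_item_count = max(count({i}, transactions) for i in itemset)
    if max_item_count == 0:
        return 0.0
    return count(itemset, transactions) / max_item_count


# Hypothetical toy data: each document is a transaction, its words are items.
docs = [
    {"web", "cluster", "graph"},
    {"web", "cluster"},
    {"web", "mining"},
    {"web", "news"},
    {"rare", "term"},
]

# A frequent word paired with a rare one: a weak cross-support pattern.
weak = all_confidence({"web", "graph"}, docs)

# Two rare words that always co-occur: low support, yet a strong pattern.
strong = all_confidence({"rare", "term"}, docs)

print(weak, strong)
```

In this toy data the pair {"rare", "term"} has the lowest possible nonzero support yet scores the maximum all-confidence, while {"web", "graph"} scores low because "web" is far more frequent than "graph". This is the sense in which an all-confidence threshold can stand in for a support threshold.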