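Affiliation: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
Source: Journal of Beijing Jiaotong University (《北京交通大学学报》), 2009, No. 5, pp. 85-89 (5 pages)
Funding: Supported by the Key Project of Science and Technology Research of the Ministry of Education (108126)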
Abstract: In incremental clustering, the initial topic models are neither sufficient nor accurate, and the cumulative effect of misses and false alarms is amplified as the number of processed reports grows. To address this problem, a topic detection and tracking method that combines periodic classification with Single-Pass clustering is proposed. An incremental clustering algorithm is first used to detect and track topics. Each time a certain number of news texts has accumulated, the reports already clustered are periodically reclassified, which improves the precision of the topic clusters and therefore the accuracy of subsequent topic detection and tracking. Experiments show that the method is effective: it lowers the miss rate and the false alarm rate and reduces the normalized detection cost.
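Keywords: topic detection and tracking; incremental clustering; text classification; k-nearest neighbor classification
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]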
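The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes bag-of-words term counts, cosine similarity, and a leave-one-out k-nearest-neighbor vote for the periodic reclassification step. The function names and the `threshold`, `period`, and `k` values are hypothetical; the record does not state the features, thresholds, or period the paper used.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Sum of member vectors; cosine is scale-invariant, so no averaging is needed."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return total


def knn_label(doc, labeled, k, min_sim):
    """Majority label among the k most similar neighbors with similarity >= min_sim.

    Returns None when no neighbor is similar enough (assumption: the report
    then keeps its current topic, so singleton topics are not merged away).
    """
    scored = [(cosine(doc, vec), label) for vec, label in labeled]
    nearest = sorted((s for s in scored if s[0] >= min_sim),
                     key=lambda s: s[0], reverse=True)[:k]
    if not nearest:
        return None
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def detect_topics(stream, threshold=0.3, period=4, k=3):
    """Single-pass clustering with a kNN reclassification pass every `period` reports."""
    docs, labels = [], []
    next_topic = 0
    for i, text in enumerate(stream, 1):
        vec = Counter(text.lower().split())

        # Single-pass step: compare the new report with each topic centroid.
        best, best_sim = None, 0.0
        for topic in sorted(set(labels)):
            members = [d for d, l in zip(docs, labels) if l == topic]
            sim = cosine(vec, centroid(members))
            if sim > best_sim:
                best, best_sim = topic, sim

        if best is not None and best_sim >= threshold:
            labels.append(best)          # tracked: joins an existing topic
        else:
            labels.append(next_topic)    # detected: starts a new topic
            next_topic += 1
        docs.append(vec)

        # Periodic step: relabel every stored report by a kNN vote over the others.
        if i % period == 0:
            relabeled = []
            for j, d in enumerate(docs):
                others = [(docs[m], labels[m]) for m in range(len(docs)) if m != j]
                new = knn_label(d, others, k, threshold)
                relabeled.append(labels[j] if new is None else new)
            labels = relabeled
    return labels
```

Calling `detect_topics` on a list of report strings returns one integer topic label per report, in arrival order. A production system would likely use TF-IDF weighting and keep running centroids instead of recomputing them for every report.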