A New Topic Detection and Tracking Approach Combining Periodic Classification and Single-Pass Clustering

Cited by: 28

Authors: SHUI Yidong (税仪冬)[1], QU Youli (瞿有利)[1], HUANG Houkuan (黄厚宽)[1]

Affiliation: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China

Source: Journal of Beijing Jiaotong University, 2009, No. 5, pp. 85-89 (5 pages)

Funding: Key Project of Science and Technology Research, Ministry of Education (108126)

Abstract: In incremental clustering, the initial topic models are neither sufficient nor accurate, and the cumulative effect of false alarms and misses is amplified as the number of processed stories grows. To address this problem, a topic detection and tracking method that combines periodic classification with Single-Pass clustering is proposed. An incremental clustering algorithm first performs topic detection and tracking; each time a certain number of news texts has accumulated, the stories already clustered are periodically reclassified, which raises the precision of the topic clusters and thereby the accuracy of subsequent topic detection and tracking. Experiments show that the method is effective: it lowers the miss rate and the false alarm rate and reduces the normalized detection cost.
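The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words term-frequency vectors, the cosine similarity, the function names, and the parameters `threshold`, `period`, and `k` are all assumptions made for illustration, since the abstract does not state the feature weighting or parameter values. The k-nearest-neighbor reclassification step is suggested only by the keywords of the record.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term-frequency vector (assumed stand-in for the paper's weighting)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Summed vector of a cluster; cosine ignores scale, so no averaging is needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total

def single_pass_assign(vec, clusters, threshold):
    """Single-Pass step: join the most similar cluster, or open a new topic."""
    best, best_sim = None, 0.0
    for i, members in enumerate(clusters):
        sim = cosine(vec, centroid(members))
        if sim > best_sim:
            best, best_sim = i, sim
    if best is None or best_sim < threshold:
        clusters.append([vec])
        return len(clusters) - 1
    clusters[best].append(vec)
    return best

def knn_reclassify(clusters, k):
    """Periodic step: relabel each story by majority vote of its k nearest other stories."""
    labeled = [(v, i) for i, members in enumerate(clusters) for v in members]
    new_clusters = [[] for _ in clusters]
    for idx, (v, label) in enumerate(labeled):
        neighbours = sorted(
            ((cosine(v, u), l) for j, (u, l) in enumerate(labeled) if j != idx),
            key=lambda pair: pair[0], reverse=True)[:k]
        votes = Counter(l for sim, l in neighbours if sim > 0)
        # Keep the current label when no neighbour shares any term.
        new_label = votes.most_common(1)[0][0] if votes else label
        new_clusters[new_label].append(v)
    return [c for c in new_clusters if c]  # drop clusters emptied by relabeling

def detect_topics(stories, threshold=0.3, period=4, k=3):
    """Cluster stories in arrival order, reclassifying after every `period` stories."""
    clusters = []
    for n, text in enumerate(stories, start=1):
        single_pass_assign(vectorize(text), clusters, threshold)
        if n % period == 0:
            clusters = knn_reclassify(clusters, k)
    return clusters

stories = [
    "earthquake hits city rescue teams",
    "rescue teams search earthquake rubble",
    "election results candidate wins vote",
    "candidate vote election campaign",
    "earthquake rescue city aid",
    "election vote count results",
]
clusters = detect_topics(stories)
print([len(c) for c in clusters])
```

The reclassification pass is where stories misassigned early, when topic models were still sparse, get a chance to move to a better-fitting cluster before further stories are compared against that cluster.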

Keywords: topic detection and tracking; incremental clustering; text classification; k-nearest-neighbor classification

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
