Authors: YU Guang-Chuan (于广川), HE Rui-Fang (贺瑞芳) [1,2], LIU Yang (刘洋), DANG Jian-Wu (党建武) [1,2]
Affiliations: [1] School of Computer Science and Technology, Tianjin University, Tianjin 300350, China; [2] Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China; [3] School of Information Science and Technology, Peking University, Beijing 100871, China
Source: Journal of Software (《软件学报》), 2017, Issue 10, pp. 2654-2673 (20 pages)
Funding: National Basic Research Program of China (973 Program) (2013CB329301); National Natural Science Foundation of China (61472277)
Abstract: Temporal Twitter summarization is an important sub-task of text summarization. It aims to extract, from the massive Twitter stream related to a hot event, a concise set of tweets that evolves over time, helping users obtain information quickly. Twitter is one of the most popular social media platforms; the explosive growth of its content and the short, unstructured nature of tweets mean that traditional summarization methods relying on text content alone are no longer adequate. At the same time, the temporal and social context that social media provides beyond the text brings new opportunities. This paper treats the Twitter stream as a signal, analyzes the complex noise in it, and proposes a new temporal Twitter summarization method that fuses macro- and micro-level temporal signals of the stream with users' social context. First, wavelet analysis models the global temporal information of the stream to detect the time points of hot sub-events related to a given keyword. Second, local temporal information and user social information (user authority) are integrated into an unsupervised, graph-based random walk summarization framework, which generates a tweet summary for each hot sub-event. For evaluation, a real-world Twitter dataset was manually annotated with expert time points and expert summaries. Experimental results show that both the wavelet analysis used for sub-event time point detection and the graph model integrating temporal-social context are effective for temporal Twitter summarization.
Keywords: temporal Twitter summarization; temporal characteristics; user social authority; wavelet denoising; context graph model
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
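Illustrative sketch: The second stage named in the abstract, a random walk over a tweet graph that integrates local temporal context and user authority, can be pictured roughly as a random walk with restart whose restart distribution is biased by those two signals. The record does not give the paper's actual formulation, so everything below is an assumption made for illustration: the function names `temporal_social_prior` and `rank_tweets`, the exponential time decay, the `tau` and `damping` parameters, and the way authority and time closeness are multiplied together.

```python
import math


def temporal_social_prior(authority, timestamps, peak_time, tau=60.0):
    """Hypothetical prior: user authority times closeness to the sub-event peak.

    The exponential decay and the product form are assumptions, not the
    paper's definition.
    """
    raw = [a * math.exp(-abs(t - peak_time) / tau)
           for a, t in zip(authority, timestamps)]
    total = sum(raw)
    return [r / total for r in raw]


def rank_tweets(similarity, prior, damping=0.85, iterations=200, tol=1e-12):
    """Random walk with restart over a tweet similarity graph.

    similarity: square matrix of non-negative tweet-to-tweet weights.
    prior: restart distribution over tweets (normalized internally).
    Returns one score per tweet; the scores sum to 1.
    """
    n = len(similarity)
    total = sum(prior)
    p = [x / total for x in prior]

    # Row-normalize edge weights into transition probabilities, ignoring
    # self-loops. A tweet with no neighbours jumps according to the prior.
    transition = []
    for i, row in enumerate(similarity):
        weights = [w if j != i else 0.0 for j, w in enumerate(row)]
        s = sum(weights)
        transition.append([w / s for w in weights] if s > 0 else p[:])

    scores = p[:]
    for _ in range(iterations):
        new = [
            (1 - damping) * p[j]
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


# Toy usage: three mutually similar tweets; the first comes from a more
# authoritative user and was posted at the detected sub-event time.
similarity = [[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]]
prior = temporal_social_prior(authority=[3.0, 1.0, 1.0],
                              timestamps=[100.0, 130.0, 130.0],
                              peak_time=100.0)
scores = rank_tweets(similarity, prior)
best = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setup the graph is symmetric, so any difference in the final scores comes only from the prior; a summary for a sub-event would then be assembled from the top-scoring tweets, with redundancy handling left out of this sketch.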