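Affiliations: [1] School of Computer Science and Engineering, South China University of Technology, Guangzhou 510640; [2] School of Computer Science, Wuyi University, Jiangmen, Guangdong 529020
Source: Application Research of Computers (《计算机应用研究》), 2011, No. 4, pp. 1322-1324 (3 pages)
Funding: Natural Science Foundation of Guangdong Province (9451064101003233); Fundamental Research Funds for the Central Universities, South China University of Technology (2009ZM0125; 2009ZM0189; 2009ZM0255)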
Abstract: This paper proposes an automatic summarization method based on a thematic term set for extracting summaries from Chinese documents. Given the extracted thematic term set, the method computes the weight of each sentence that contains a thematic term, weighting by the term weights, and so obtains a total weight for each sentence with respect to the term set. It then selects the highest-weighted sentences according to the summary ratio and outputs them in their original order. Experiments were conducted on the single-document summarization corpus of the HIT Information Retrieval Lab, and the summaries were scored with intrinsic automatic evaluation measures; the overall F-measure reached 66.07%. The results indicate that the generated summaries are of high quality and close to the reference summaries.
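The pipeline described in the abstract (weight sentences by thematic terms, keep the top share, restore original order) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `summarize`, the punctuation-based sentence splitter, the use of raw occurrence counts, and the default `ratio` are all assumptions, and the thematic term weights are taken as given rather than extracted.

```python
import re


def summarize(text, term_weights, ratio=0.3):
    """Extractive summary sketch: score sentences by weighted thematic-term hits.

    term_weights maps each thematic term to its weight (assumed precomputed).
    ratio is the share of sentences to keep (hypothetical default).
    """
    # Naive sentence split on Chinese and Western end punctuation (assumption;
    # a "." inside a number would also split, which a real splitter must avoid).
    sentences = [s.strip() for s in re.findall(r"[^。!?.!?]+[。!?.!?]?", text)]
    sentences = [s for s in sentences if s]
    if not sentences:
        return []

    # Sentence weight = sum over thematic terms of weight * occurrence count.
    scores = [
        sum(w * s.count(term) for term, w in term_weights.items())
        for s in sentences
    ]

    # Number of sentences to keep, derived from the summary ratio (at least 1).
    k = max(1, round(len(sentences) * ratio))

    # Highest-scoring sentences first; ties go to the earlier sentence.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]

    # Output the selected sentences in their original document order.
    return [sentences[i] for i in sorted(top)]
```

For Chinese input, the term lookup via `str.count` works on raw substrings, so no word segmentation is needed for this sketch; the paper's actual term extraction and weighting scheme is not specified in the abstract and is not reproduced here.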