Source: Computer Engineering and Design (《计算机工程与设计》), 2008, No. 3, pp. 765-767 (3 pages)
Funding: Shanghai Key Discipline Construction Fund project (T0602); Shanghai Municipal Education Commission Research Fund project (06FZ006)
Abstract: Traditional keyword extraction algorithms mostly rely on high-frequency words. However, the keywords of a document are often not all high-frequency words, so keywords must also be found among the non-high-frequency words. A document can be abstracted as a graph in which nodes represent terms and edges represent co-occurrence relations between terms. Based on this topological structure of the document, a new keyword extraction algorithm is proposed and compared with traditional keyword extraction algorithms. The experimental results show that the proposed algorithm has a certain advantage over the traditional algorithms in precision and coverage.
Classification number: TP311.1 [Automation and Computer Technology: Computer Software and Theory]
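The graph view described in the abstract (terms as nodes, co-occurrence as edges) can be illustrated with a minimal sketch. The abstract does not say how co-occurrence is defined or how nodes are scored, so everything below is an assumption made for illustration: co-occurrence is taken to mean "appears in the same sentence", tokens are plain lowercase ASCII words, and node degree is used as a stand-in ranking rather than the paper's actual algorithm. The function names `build_cooccurrence_graph` and `rank_by_degree` are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(text, stopwords=frozenset()):
    """Return {term: set of neighbouring terms} for a document.

    Assumption: two terms co-occur when they appear in the same sentence.
    The abstract does not specify the co-occurrence window.
    """
    graph = defaultdict(set)
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Unique terms of this sentence, minus stopwords.
        terms = {t for t in re.findall(r"[a-z]+", sentence) if t not in stopwords}
        for term in terms:
            graph[term]  # make sure isolated terms still become nodes
        # Connect every pair of terms that share the sentence.
        for a, b in combinations(sorted(terms), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def rank_by_degree(graph, top_k=5):
    """Rank terms by number of distinct neighbours (ties broken alphabetically).

    Assumption: degree is only a placeholder score, not the paper's method.
    """
    return sorted(graph, key=lambda term: (-len(graph[term]), term))[:top_k]
```

A term that appears only once can still be well connected in such a graph, which is one way a topology-based score could surface keywords that a pure frequency count would miss.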