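Affiliations: [1] Department of Computer Science, Xi'an Institute of Posts and Telecommunications, Xi'an, Shaanxi 710121 [2] Information Center, Xi'an Institute of Posts and Telecommunications, Xi'an, Shaanxi 710121 [3] Library, Xidian University, Xi'an, Shaanxi 710071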
Source: Journal of Xi'an Institute of Posts and Telecommunications, 2007, No. 5, pp. 82-86 (5 pages)
Abstract: This paper implements a system that extracts keywords from Chinese documents using the small-world network (SWN) model. A small-world network is characterized by two statistical properties: average path length and clustering coefficient. The algorithm first segments the document into terms and builds a document structure graph in which the nodes are terms and the edges are adjacency (co-occurrence) relations between terms, which describe the semantic association between the terms of the document. For each term, it then computes the change in average path length and the change in clustering coefficient, and uses these two changes as the criteria for selecting keywords. Finally, the extracted keywords are merged into compound keywords according to a fixed strategy. The paper first introduces the concept of the small-world network and its application to keyword extraction, then describes the design and implementation of the system, and finally reports experiments indicating that the algorithm is correct and effective.
Keywords: small-world network; keyword extraction; change in average path length; change in clustering coefficient
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
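The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the input is assumed to be already segmented into terms, "change" is assumed to mean the difference in each statistic after removing a term from the graph, and unreachable node pairs are assumed to count as distance n (the number of nodes). The merging of keywords into compound keywords is not covered.

```python
from collections import deque
from itertools import combinations


def build_graph(tokens):
    """Nodes are distinct terms; an edge joins terms that appear next to each other."""
    graph = {t: set() for t in tokens}
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def remove_node(graph, node):
    """Return a copy of the graph without `node` and its edges."""
    return {u: nbrs - {node} for u, nbrs in graph.items() if u != node}


def average_path_length(graph):
    """Mean shortest-path length over all ordered node pairs (BFS from each node)."""
    n = len(graph)
    if n < 2:
        return 0.0
    total = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Assumption: each unreachable node counts as distance n.
        total += sum(dist.values()) + n * (n - len(dist))
    return total / (n * (n - 1))


def clustering_coefficient(graph):
    """Average local clustering coefficient; nodes with fewer than 2 neighbours score 0."""
    if not graph:
        return 0.0
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


def score_terms(tokens):
    """Map each term to (change in path length, change in clustering) after its removal."""
    graph = build_graph(tokens)
    base_l = average_path_length(graph)
    base_c = clustering_coefficient(graph)
    scores = {}
    for term in graph:
        reduced = remove_node(graph, term)
        scores[term] = (
            average_path_length(reduced) - base_l,
            clustering_coefficient(reduced) - base_c,
        )
    return scores
```

Calling `score_terms` on a term sequence returns both changes per term; how the two values are combined into a final ranking is left to the caller, since the abstract does not specify it. A term whose removal sharply increases the average path length is one that holds the graph together, which is the intuition behind treating it as a keyword candidate.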