An Iterative Approach to Automatic Attribute Acquisition (一种迭代式的概念属性名称自动获取方法)

Cited by: 4

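Affiliations: [1] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [2] University of Chinese Academy of Sciences, Beijing 100049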

Source: Journal of Chinese Information Processing (《中文信息学报》), 2014, No. 4, pp. 58-67 (10 pages)

Funding: National Natural Science Foundation of China (91224006; 61203284; 61173063; 61035004; 30973716); National Social Science Foundation of China (10AYY003)

Abstract: Attributes are a special kind of knowledge used to describe and distinguish concepts, and attribute names are the proper nouns that denote attributes. This paper presents an iterative prefix- and suffix-based method for acquiring the attribute names of concepts from Web pages. Each iteration consists of two phases: (1) selecting appropriate prefixes and suffixes from the current set of seed attributes, building lexico-syntactic patterns from them, and extracting candidate attributes from Web pages; (2) validating the candidate attributes with a similarity-based model in order to expand the current attribute set. We propose a group of validation models and compare the advantages and disadvantages of each. Evaluated on concepts in the geographic class and the business-entity class, the method achieves average precision of 92.9% and 90.7%, respectively, and expands the original seed attribute set nearly 100-fold.
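The two-phase loop described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the actual pattern forms, the affix-selection criteria, or the validation models. The pattern "the ATTRIBUTE of CONCEPT", the rule that an affix must be shared by at least two attributes, the Jaccard-overlap validation, the function names, and the toy English corpus are all assumptions made here for illustration.

```python
import re
from collections import Counter

def select_affixes(attrs, min_count=2):
    """Phase 1a (assumed rule): keep first/last words shared by >= min_count attributes."""
    words = [a.split() for a in attrs if len(a.split()) > 1]
    first = Counter(w[0] for w in words)
    last = Counter(w[-1] for w in words)
    return ({p for p, c in first.items() if c >= min_count},
            {s for s, c in last.items() if c >= min_count})

def extract(prefixes, suffixes, corpus):
    """Phase 1b: apply hypothetical 'the ATTR of CONCEPT' patterns to the corpus.

    Returns a dict mapping each matched attribute to the concepts it was seen with.
    """
    patterns = [rf"the ({re.escape(p)} \w+) of (\w+)" for p in prefixes]
    patterns += [rf"the (\w+ {re.escape(s)}) of (\w+)" for s in suffixes]
    seen = {}
    for sentence in corpus:
        for pattern in patterns:
            for attr, concept in re.findall(pattern, sentence):
                seen.setdefault(attr, set()).add(concept)
    return seen

def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def acquire(seeds, corpus, threshold=0.5, max_iter=5):
    """Iterate extraction and validation until no new attribute is accepted."""
    attrs = set(seeds)
    for _ in range(max_iter):
        prefixes, suffixes = select_affixes(attrs)
        seen = extract(prefixes, suffixes, corpus)
        # Concepts already described by accepted attributes.
        known = set().union(*(seen.get(a, set()) for a in attrs))
        # Phase 2 (assumed model): accept a candidate whose concepts overlap enough.
        accepted = {a for a, concepts in seen.items()
                    if a not in attrs and jaccard(concepts, known) >= threshold}
        if not accepted:
            break
        attrs |= accepted
    return attrs

# Toy data, invented for this sketch only.
SEEDS = {"land area", "urban area", "land use", "forest cover"}
CORPUS = [
    "the land area of Yunnan is large",
    "the urban area of Beijing keeps growing",
    "the forest area of Yunnan and the forest area of Beijing differ",
    "the land cover of Beijing is mostly urban",
    "the forest density of Yunnan is high",
    "the surface area of spheres grows quadratically",
]

print(sorted(acquire(SEEDS, CORPUS) - SEEDS))
```

On this toy corpus, "forest area" and "land cover" are accepted in the first pass; they make "forest" a shared prefix, which lets the second pass reach "forest density". "surface area" matches a pattern but is rejected because it never co-occurs with a known concept. That is the point of iterating: each validated attribute can contribute new affixes for the next round.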

Keywords: concept attribute; attribute prefix; attribute suffix; attribute element; knowledge acquisition

Classification: TP391 [Automation and Computer Technology - Computer Application Technology]
