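Authors: Zhang Suxiang[1,2], Wen Juan[1], Qin Ying[1], Yuan Caixia[1], Zhong Yixin[1]
Affiliations: [1] Research Center of Intelligence Science and Technology, School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing 100876; [2] Department of Electronic and Communication Engineering, North China Electric Power University, Baoding, Hebei 071003
Source: Journal of Harbin Engineering University (《哈尔滨工程大学学报》), 2006, Issue B07, pp. 370-373 (4 pages)
Funding: Supported by a major fund project of the computer theme of the National 863 Program (2001AA114210).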
Abstract: To address the difficulty of acquiring entity relations automatically, this paper combines the Maximum Entropy (ME) algorithm with the Bootstrapping algorithm. Drawing on the Bootstrapping algorithm and the idea of scalar clustering, seed patterns and seed words are used to obtain the characteristic words that the ME algorithm requires as features; these characteristic words are semantically similar to the seed words. In combination with the ME algorithm, nine features are systematically designed, covering the morphological, grammatical, and semantic aspects of the language, so as to describe the entities in the text as comprehensively as possible. A system framework for the experiments was built, and automatic entity relation extraction was implemented. The experimental results indicate that the method can effectively handle the automatic extraction of entity relations.
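Keywords: maximum entropy; Bootstrapping; feature selection; entity relation extraction; evaluation
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]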