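Affiliations: [1] School of Information Science and Technology, Northwest University, Xi'an 710127; [2] College of Information Science and Technology, Beijing Normal University, Beijing 100875
Source: Application Research of Computers (《计算机应用研究》), 2008, No. 8, pp. 2295-2298, 2308 (5 pages)
Funding: Supported by the Application Service Support System project of the National Science and Technology Infrastructure Platform (2005DKA63900)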
Abstract: This paper addresses how to build a database of semantic relationships between Chinese words within the National Science and Technology Infrastructure Platform, and how to use an initial relationship database to acquire lexical patterns and new relationships automatically from a corpus. Words are linked by many potential relationships and can be connected into one large network, so the problem is how to model this network and how to obtain the relationships automatically. Using the concept of a lexical pattern, the paper proposes discovering new patterns from existing relationships and new relationships from existing patterns, designs the corresponding model, and implements a Chinese semantic relationship knowledge base system. With this system and natural language processing techniques, more than 200 effective relationships were acquired automatically at large scale from the Sogou corpus and Baidu Baike pages, and more than 1,000 effective new relationships (such as inheritance and synonymy) were extracted from them. Experiments show a precision of about 40%, which depends mainly on the distance allowed between the query words in a relationship and on the nature of the corpus.
Classification No.: TP301.2 [Automation and Computer Technology: Computer System Architecture]
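The abstract describes a loop in which known relationships yield new patterns and known patterns yield new relationships. The following is a minimal sketch of one round of such a loop, not the authors' implementation: the function names, the `max_gap` distance limit, the pre-tokenized English toy sentences, and the seed pairs are all assumptions made for illustration. A pattern here is simply the token sequence found between two related words.

```python
from collections import Counter


def learn_patterns(seeds, sentences, max_gap=4):
    """Collect the token sequences that appear between known related pairs.

    seeds maps (head, tail) -> relation name. max_gap (hypothetical) caps
    how many tokens may separate the two words. Only the first occurrence
    of each word in a sentence is considered.
    """
    patterns = Counter()
    for tokens in sentences:
        for (head, tail), relation in seeds.items():
            if head in tokens and tail in tokens:
                i, j = tokens.index(head), tokens.index(tail)
                if 0 < j - i - 1 <= max_gap:
                    patterns[(relation, tuple(tokens[i + 1:j]))] += 1
    return patterns


def apply_patterns(patterns, sentences):
    """Return (head, tail, relation) triples whose words surround a pattern."""
    found = set()
    for tokens in sentences:
        for relation, infix in patterns:
            n = len(infix)
            # k is where the infix starts; one token must precede and follow it.
            for k in range(1, len(tokens) - n):
                if tuple(tokens[k:k + n]) == infix:
                    found.add((tokens[k - 1], tokens[k + n], relation))
    return found


def bootstrap_once(seeds, sentences, max_gap=4):
    """One round: seeds -> patterns -> relationships not already in seeds."""
    patterns = learn_patterns(seeds, sentences, max_gap)
    known = {(h, t, r) for (h, t), r in seeds.items()}
    return apply_patterns(patterns, sentences) - known


# Toy corpus (an assumption; the paper uses the Sogou corpus and Baidu Baike).
sentences = [
    "a sparrow is a kind of bird".split(),
    "a salmon is a kind of fish".split(),
    "car is also called automobile".split(),
    "sofa is also called couch".split(),
]
seeds = {("sparrow", "bird"): "inherit", ("car", "automobile"): "synonym"}

new_relations = bootstrap_once(seeds, sentences)
print(sorted(new_relations))
```

In this sketch a smaller `max_gap` admits fewer, tighter patterns and a larger one admits more but noisier ones, which is one way to read the abstract's remark that precision depends on the distance between the query words.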