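Affiliation: [1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
Source: Journal of Software (《软件学报》), 2010, No. 6, pp. 1267-1276 (10 pages)
Funding: National Natural Science Foundation of China, Nos. 60803093 and 60675034; National High Technology Research and Development Program of China (863 Program), No. 2008AA01Z144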
Abstract: This paper addresses paraphrase collocation extraction, using the verb-object ("OBJ") relation as a case study. Specifically, the proposed method recasts paraphrase collocation extraction as a binary classification problem and combines multiple features based on translation, thesauri, polarity words, and web mining. Experimental results show that the binary classification approach is effective for extracting paraphrase collocations, and that each of the features helps improve extraction performance. With this method, more than 280,000 pairs of paraphrase collocations are extracted, with a precision above 70%. Further experiments show that the extracted paraphrase collocations enable paraphrase generation for roughly 40% of sentences, which demonstrates the practical value of the method.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]