Authors: Yuan Xiaojie (袁晓洁)[1], Yu Shitao (于士涛)[1], Shi Jianxing (师建兴)[1], Chen Qiushuang (陈秋双)[1]
Source: Journal of Southeast University (English Edition), 2008, No. 3, pp. 272-275 (4 pages)
Funding: Microsoft Research Asia Internet Services in Academic Research Fund (No. FY07-RES-OPP-116); the Science and Technology Development Program of Tianjin (No. 06YFGZGX05900)
Abstract: To improve question answering (QA) performance on real-world web data sets, a new set of question classes and a general answer re-ranking model are defined. Using a pre-defined dictionary and grammatical analysis, the question classifier brings both semantic and grammatical information into information retrieval and machine learning methods in the form of various training features, including the question word, the main verb of the question, the dependency structure between the question word and the main verb, the position of the main auxiliary verb, the main noun of the question, and the top hypernym of the main noun, among others. The QA query results are then re-ranked using the question class information. Experiments show that the classifier accurately classifies questions in real-world web data sets and that the QA results improve noticeably after re-ranking. This indicates that, with both semantic and grammatical information, applications such as QA built upon real-world web data sets can achieve better performance.
Keywords: question classification; question answering; real-world web data sets; question and answer web forums; re-ranking model
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
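Note: The abstract describes re-ranking QA results by question class but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' model. The class names, the `Candidate` type, the `classify` and `rerank` functions, and the additive `boost` are all hypothetical, and the toy classifier uses only the question word, one of the several features the abstract lists.

```python
from dataclasses import dataclass

# Hypothetical question classes keyed by question word.
# The paper's actual class set is not given in the abstract.
QUESTION_WORD_TO_CLASS = {
    "who": "person",
    "where": "location",
    "when": "time",
    "how": "procedure",
}


@dataclass
class Candidate:
    text: str
    answer_class: str  # assumed: each candidate carries a class label
    score: float       # original retrieval score


def classify(question: str) -> str:
    """Toy classifier: looks only at the first word of the question."""
    words = question.lower().split()
    first = words[0] if words else ""
    return QUESTION_WORD_TO_CLASS.get(first, "other")


def rerank(question: str, candidates: list, boost: float = 0.5) -> list:
    """Sort candidates by score, boosting those matching the question class."""
    q_class = classify(question)

    def adjusted(c: Candidate) -> float:
        return c.score + (boost if c.answer_class == q_class else 0.0)

    return sorted(candidates, key=adjusted, reverse=True)


candidates = [
    Candidate("At nine o'clock", "time", 0.6),
    Candidate("On the second floor", "location", 0.4),
]
ranked = rerank("Where is the library?", candidates)
```

Here the location-class candidate moves ahead of the higher-scoring time-class candidate because its class matches the question. A trained classifier over the full feature set would replace the lookup table in practice.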