Affiliations: [1] Ministry of Education-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China; [2] Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA; [3] Heilongjiang Institute of Technology, Harbin 150001, China; [4] Peking University, Beijing 100871, China
Source: Journal of Electronics (China) (电子科学学刊(英文版)), 2008, No. 1, pp. 120-124 (5 pages)
Funding: Supported by the High Technology Research and Development Program of China (No. 2006AA01Z150); the Key Project of the National Natural Science Foundation of China (No. 60373101); the Natural Science Foundation of Heilongjiang Province (No. F2007-14); the Project of Heilongjiang Outstanding Young University Teacher (No. 1151G037).
Abstract: This letter presents a new discriminative model for Information Retrieval (IR), referred to as the Ordinal Regression Model (ORM). ORM differs from most existing models in that it views IR as an ordinal regression problem (i.e., a ranking problem) instead of a binary classification problem. Because the task of IR is to rank documents according to the user's information need, IR can naturally be viewed as ordinal regression. Two parameter learning algorithms for ORM are presented: a perceptron-based algorithm and the ranking Support Vector Machine (SVM). The effectiveness of the proposed approach has been evaluated on the task of ad hoc retrieval using three English Text REtrieval Conference (TREC) sets and two Chinese TREC sets. Results show that ORM significantly outperforms the state-of-the-art language model approaches and the OKAPI system on all test sets, and that it is more appropriate to view IR as ordinal regression rather than binary classification.
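The abstract names a perceptron-based learning algorithm but gives no details, so the following is only a generic pairwise ranking perceptron, offered as a minimal sketch of the ordinal-regression view of IR and not as the authors' ORM algorithm. The feature vectors, the function names `train_pairwise_perceptron` and `rank`, and the `epochs` and `lr` parameters are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(w: Vector, x: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise_perceptron(
    pairs: Sequence[Tuple[Vector, Vector]],
    dim: int,
    epochs: int = 10,
    lr: float = 1.0,
) -> List[float]:
    """Learn weights w so that score(better) > score(worse) for each pair.

    Each pair is (better_doc_features, worse_doc_features) for one query.
    This is a hypothetical setup: the source does not specify features.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        mistakes = 0
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            # A pair is misordered when the preferred document does not
            # score strictly higher; update w in the direction of diff.
            if dot(w, diff) <= 0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
                mistakes += 1
        if mistakes == 0:  # every pair is ordered correctly; stop early
            break
    return w


def rank(w: Vector, docs: Sequence[Vector]) -> List[int]:
    """Return document indices sorted by descending score w . x."""
    return sorted(range(len(docs)), key=lambda i: dot(w, docs[i]), reverse=True)


# Toy data (invented for illustration): two preference pairs in 2-D.
pairs = [([1.0, 0.0], [0.0, 1.0]), ([2.0, 1.0], [1.0, 1.0])]
w = train_pairwise_perceptron(pairs, dim=2)
order = rank(w, [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
```

The point of the sketch is the training signal: the model is updated on ordered pairs of documents rather than on per-document relevant/non-relevant labels, which is the distinction the abstract draws between ordinal regression and binary classification.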
Keywords: Information Retrieval (IR); Ordinal regression; Perceptron; Ranking Support Vector Machine (SVM)
Classification number: TP391.2 [Automation and Computer Technology: Computer Application Technology]