Affiliations: [1] Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; [2] College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China; [3] Search Business Department, Alibaba Group, Beijing 100020, China
Source: Science China (Information Sciences), 2017, Issue 12, pp. 161-174 (14 pages); English edition of 中国科学(信息科学)
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2016YFB1000902); the National Basic Research Program of China (973 Program) (Grant No. 2013CB329600); the National Natural Science Foundation of China (Grant No. 61472040); and the National Natural Science Basic Research Plan in Shaanxi Province of China (Grant No. 2016JM6082)
Abstract: Classification is an essential task in data mining, machine learning, and pattern recognition. Conventional classification models focus on distinguishing samples from different categories. However, there are also fine-grained differences between data instances within a particular category. These differences form the preference information that is essential for human learning and, in our view, could also be helpful for classification models. In this paper, we propose a preference-enhanced support vector machine (PSVM) that incorporates preference-pair data as a specific type of supplementary information into SVM. Additionally, we propose a two-layer heuristic sampling method to obtain effective preference pairs, and an extended sequential minimal optimization (SMO) algorithm to fit PSVM. To evaluate our model, we use the task of knowledge base acceleration-cumulative citation recommendation (KBA-CCR) on the TREC-KBA-2012 dataset, together with seven other datasets from UCI, StatLib, and mldata.org. The experimental results show that our proposed PSVM exhibits high performance on the official evaluation metrics.
Keywords: preference; SVM; classification; sampling; sequential minimal optimization (SMO)
Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]