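Authors: Jiang Jiyuan (姜纪远)[1], Tao Qing (陶卿)[1], Gao Qiankun (高乾坤)[1], Chu Dejun (储德军)[1]
Affiliation: [1] 11th Department, Army Officer Academy of PLA, Hefei, Anhui 230031, China
Source: Journal of Software (《软件学报》), 2014, No. 10, pp. 2282-2292 (11 pages)
Funding: National Natural Science Foundation of China (61273296; 60975040); Natural Science Foundation of Anhui Province (1308085QF121)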
Abstract: AUC is widely used as an evaluation measure for classification on imbalanced data. Unlike in standard binary classification, the loss function in the AUC problem is defined over pairs of instances drawn from the two different classes. How to improve its practical convergence speed is a question worth studying. Recent results show that the online method (OAM), which uses the reservoir sampling technique, achieves good AUC performance, but OAM still has shortcomings such as slow convergence and complicated parameter selection. This paper presents a systematic study of dual coordinate descent methods for the AUC optimization problem (AUC-DCD) and gives three algorithms: AUC-SDCD, AUC-SDCDperm, and AUC-MSGD. The first two depend on the number of training samples, while AUC-MSGD does not. Theoretical analysis shows that OAM is a special case of AUC-DCD. Experimental results show that AUC-DCD outperforms OAM in both AUC performance and convergence speed. These results suggest that AUC-DCD is a method of choice for solving AUC optimization problems.
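Keywords: machine learning; optimization methods; AUC; dual coordinate descent; support vector machine
Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]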
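The abstract notes that the AUC loss is built from pairs of instances drawn from the two classes. The sketch below is only an illustration of that pairwise structure, not the paper's AUC-DCD algorithms: it computes the empirical AUC as the fraction of correctly ranked (positive, negative) pairs, and a pairwise hinge surrogate. The function names, the tie-handling convention (ties count 0.5), and the choice of the hinge surrogate with margin 1 are assumptions made for this example.

```python
def pairwise_auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly.

    Assumption for this sketch: a tied pair counts as half correct.
    """
    total = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                total += 1.0
            elif sp == sn:
                total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


def pairwise_hinge_loss(scores_pos, scores_neg):
    """Average hinge loss over all (positive, negative) pairs.

    Each pair contributes max(0, 1 - (s_pos - s_neg)), so the loss is zero
    only when every positive outscores every negative by at least 1.
    The hinge surrogate is an illustrative assumption.
    """
    total = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            total += max(0.0, 1.0 - (sp - sn))
    return total / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    pos = [0.9, 0.8]    # hypothetical scores of positive instances
    neg = [0.1, 0.85]   # hypothetical scores of negative instances
    print(pairwise_auc(pos, neg))
    print(pairwise_hinge_loss(pos, neg))
```

Both functions visit every one of the `len(scores_pos) * len(scores_neg)` pairs, which shows why the number of loss terms grows much faster than the number of samples and why the convergence speed of solvers for this problem is a practical concern.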