Authors: Zhou Yu[1], He Jianjun[1,2], Gu Hong[1], Zhang Junxing[2]
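Affiliations: [1] Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, Liaoning 116024; [2] College of Information and Communication Engineering, Dalian Minzu University, Dalian, Liaoning 116600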
Source: Journal of Computer Research and Development, 2016, No. 5, pp. 1053-1062 (10 pages)
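Funding: National Natural Science Foundation of China (61503058; 61374170; 61502074; U1560102); Specialized Research Fund for the Doctoral Program of Higher Education (20120041110008); Fundamental Research Funds for the Central Universities (DC201501055; DC201501060201)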
Abstract: In the age of big data, learning with weak supervision has become one of the hot research topics in machine learning. Partial label learning, which deals with the problem where each training example is associated with a set of candidate labels among which only one corresponds to the ground truth, is an important weakly supervised learning framework proposed recently, and it can be widely used in many real-world tasks. The max-loss function can accurately capture the relationship between a partially labeled sample and its candidate labels. However, since the max-loss function usually leads to a nondifferentiable objective function that is difficult to solve, it is rarely adopted in existing algorithms. Moreover, existing partial label learning algorithms can only handle small-scale problems and can rarely be applied to big data. To address these two problems, this paper presents a fast partial label learning algorithm with the max-loss function. The basic idea is to transform the nondifferentiable objective into a differentiable concave function by introducing the aggregate function to approximate the max(·) function involved in the max-loss function, and then to solve the resulting concave objective with a stochastic quasi-Newton method. The experimental results show that the proposed algorithm not only achieves higher classification accuracy but also runs much faster than the traditional algorithms based on average-loss functions. Moreover, the proposed algorithm can handle problems with millions of samples within several minutes.
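The abstract does not state the aggregate function's formula. The following is a minimal sketch, assuming the commonly used log-sum-exp form F_p(x) = (1/p) ln Σ_i exp(p·x_i), of how max(·) can be replaced by a smooth surrogate. The function names and the smoothing parameter `p` are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's objective or its stochastic quasi-Newton solver.

```python
import math

def aggregate_max(values, p=50.0):
    """Smooth surrogate for max(values), assuming the log-sum-exp form."""
    m = max(values)  # shift by the true max so exp() cannot overflow
    return m + math.log(sum(math.exp(p * (v - m)) for v in values)) / p

def aggregate_max_grad(values, p=50.0):
    """Gradient of the surrogate: softmax weights with temperature 1/p."""
    m = max(values)
    weights = [math.exp(p * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical model outputs for the candidate labels of one sample.
scores = [0.2, 1.5, 1.4, -3.0]
approx = aggregate_max(scores, p=50.0)
grad = aggregate_max_grad(scores, p=50.0)
print(approx, grad)
```

Under this assumed form, the surrogate never falls below the true maximum and exceeds it by at most ln(n)/p for n values, so a larger `p` tightens the approximation at the cost of a more sharply curved function.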
Keywords: partial label learning; max-loss function; aggregate function; weakly supervised learning; classification accuracy
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]