Authors: LI Yan-Chao (李延超), XIAO Fu (肖甫)[1], CHEN Zhi (陈志)[1], LI Bo (李博)
Affiliations: [1] School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China; [2] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
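Source: Journal of Software (《软件学报》), 2020, No. 12, pp. 3808-3822 (15 pages)
Funding: National Natural Science Foundation of China (61932013); Natural Science Foundation of Jiangsu Province (BK20200739); Jiangsu Province 333 High-Level Talent Training Project (BRA2020065).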
Abstract: Active learning selects examples from a large pool of unlabeled data and passes them to an expert for labeling, which eases the labeling bottleneck. Existing batch mode active learning algorithms suffer from three limitations: (1) methods that rely on a single selection criterion, or that make assumptions about the data or the model, struggle to find unlabeled examples that are both informative (uncertain) and representative; (2) performance depends heavily on the accuracy of the similarity measure between examples, such as a predefined similarity function or a diversity measure, which can lead to suboptimal performance and to selected sets with redundant examples; (3) noisy labels have long degraded the performance of batch mode active learning algorithms. This study proposes a batch mode active learning method based on deep learning. A deep neural network generates learned representations (embeddings) of labeled and unlabeled examples, and a label cycle mode is adopted: embeddings of labeled examples are connected to those of unlabeled examples and then back to labeled examples of the same class. This accounts for both the informativeness and the representativeness of examples and makes the algorithm robust to noisy labels. A submodular function is used to keep the selected set diverse and to reduce redundancy among the selected examples. Moreover, the weights of the query criteria are optimized adaptively, so the algorithm automatically balances informativeness and representativeness. The proposed method is applied to semi-supervised classification and semi-supervised clustering. For classification, the batch mode active scheme is incorporated into the classification approaches to improve generalization ability. For semi-supervised clustering, the proposed active scheme for constraints is used to facilitate fast convergence and to perform better than unsupervised clustering. To validate the proposed algorithms, extensive experiments are conducted on diverse benchmark datasets for different tasks, and the experimental results demonstrate consistent and substantial improvements over state-of-the-art approaches.
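The abstract does not give the paper's objective function, so the following is only a minimal sketch of the general idea it describes: choosing a batch that is uncertain, representative, and non-redundant by greedily maximizing a submodular score. Everything here is an assumption for illustration. Prediction entropy stands in for informativeness, a facility-location coverage term over cosine similarities stands in for the paper's submodular function, and `lam` is a hypothetical fixed weight (the paper optimizes its trade-off adaptively). The names `select_batch`, `cosine`, and `entropy` are not from the source.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def entropy(probs):
    """Shannon entropy of a predicted class distribution (uncertainty proxy)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(embeddings, class_probs, batch_size, lam=0.5):
    """Greedily pick a batch of indices from the unlabeled pool.

    Score of a set S (an assumed stand-in, not the paper's objective):
        lam * sum of entropies in S
        + (1 - lam) * mean over pool of max similarity to any member of S
    Both terms are monotone submodular, so greedy selection is a
    standard heuristic for this kind of score.
    """
    n = len(embeddings)
    # Clip at zero so the coverage term stays non-negative and monotone.
    sim = [[max(0.0, cosine(embeddings[i], embeddings[j])) for j in range(n)]
           for i in range(n)]
    unc = [entropy(p) for p in class_probs]
    coverage = [0.0] * n  # best similarity of each pool item to the batch so far
    selected = []
    for _ in range(min(batch_size, n)):
        best, best_gain = None, -1.0
        for i in range(n):
            if i in selected:
                continue
            # Marginal coverage gain: how much i improves each item's best match.
            cover_gain = sum(max(sim[i][j] - coverage[j], 0.0) for j in range(n))
            gain = lam * unc[i] + (1 - lam) * cover_gain / n
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
        coverage = [max(coverage[j], sim[best][j]) for j in range(n)]
    return selected
```

With `lam=1.0` the sketch reduces to plain uncertainty sampling, which tends to pick near-duplicate examples. With a smaller `lam`, the coverage term makes a second near-duplicate contribute almost no gain, so a less similar example is chosen instead.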
Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]