Authors: WANG Shuaiwei (王帅炜); LEI Jie (雷杰); FENG Zunlei (冯尊磊); LIANG Ronghua (梁荣华)[1]
Affiliations: [1] College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; [2] College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Source: Computer Science (《计算机科学》), 2024, No. 11, pp. 112-132 (21 pages)
Funding: National Natural Science Foundation of China (62106226, 62036009); Zhejiang Provincial Natural Science Foundation (LQ22F020013, LDT23F0202).
Abstract: Representation learning is an important component of artificial intelligence algorithms, and a well-designed representation can make downstream tasks far easier. With the development of deep learning in computer vision, visual representation learning has become increasingly important; its aim is to transform complex visual information into a representation that is easier for artificial intelligence algorithms to learn. This paper surveys widely used research on visual representation learning and, according to the degree and type of data dependency, divides it into pre-trained visual representation learning, generative visual representation learning, contrastive visual representation learning, decoupled visual representation learning, and visual representation learning combined with language information. Specifically, pre-trained visual representation learning applies supervised pre-trained models to visual representation learning; generative visual representation learning uses generative models to learn visual representations; and contrastive visual representation learning covers the various network frameworks that use contrastive learning to learn visual representations. The paper also presents the applications of variational autoencoders (VAEs) and generative adversarial networks (GANs) in decoupled visual representation learning, as well as various approaches that use language information to enhance visual representation learning. Finally, it summarizes the evaluation criteria for visual representation learning and future perspectives.
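Keywords: visual representation learning; artificial intelligence algorithms; decoupled visual representation learning; language information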
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]