Review of Visual Representation Learning (视觉表征学习综述)

Authors: WANG Shuaiwei (王帅炜), LEI Jie (雷杰), FENG Zunlei (冯尊磊), LIANG Ronghua (梁荣华) [1]

Affiliations: [1] College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; [2] College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

Source: Computer Science (《计算机科学》), 2024, No. 11, pp. 112-132 (21 pages)

Funding: National Natural Science Foundation of China (62106226, 62036009); Zhejiang Provincial Natural Science Foundation (LQ22F020013, LDT23F0202).

Abstract: Representation learning is an essential component of artificial intelligence algorithms, and a well-designed representation can make downstream tasks far easier. With the development of deep learning in computer vision, visual representation learning has become increasingly important; its goal is to transform complex visual information into a representation that is easier for artificial intelligence algorithms to learn. This paper reviews research on widely used visual representation learning methods and, according to the degree and type of data dependency, divides them into five categories: pre-trained visual representation learning, generative visual representation learning, contrastive visual representation learning, decoupled visual representation learning, and visual representation learning combined with language information. Specifically, pre-trained visual representation learning applies supervised pre-trained models to visual representation learning; generative visual representation learning uses generative models to learn visual representations; and the part on contrastive visual representation learning covers the various network frameworks that learn visual representations through contrastive learning. The paper also presents the applications of variational autoencoders (VAE) and generative adversarial networks (GAN) in decoupled visual representation learning, as well as various approaches that use language information to enhance visual representation learning. Finally, it summarizes evaluation criteria for visual representation learning and future perspectives.

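Keywords: visual representation learning; artificial intelligence algorithms; decoupled visual representation learning; language information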

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
