Input-based Frozen ViT Based on Prompt Generation Network


Author: HUANG Chihan (School of Design Art & Media, Nanjing University of Science and Technology, Nanjing 210094, China)

Affiliation: [1] School of Design Art & Media, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China

Source: Computer & Network, 2024, No. 5, pp. 456-460 (5 pages)

Abstract: Since the introduction of Transformer models in computer vision, increasing the amount of data used by the model has been an effective way to achieve better performance and robustness. However, once a model reaches hundreds of millions of parameters, conventional fine-tuning becomes increasingly limiting and is sometimes not applicable at all. Visual prompting, which adapts a model by learning additional inputs, has therefore emerged as a way to work with frozen, cloud-hosted models, since it requires neither feed-forward processing nor post-processing. This paper proposes the Prompt Generative Network (PGN), which is trained end to end to generate high-performing, input-dependent prompts. PGN can adapt to various training sets during pre-training, outperforms previous methods on the datasets evaluated, and uses 100 times fewer model parameters.

Keywords: prompt generation network; Transformer; computer vision; input adaptation

CLC number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
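
The data flow described in the abstract (a small trainable network produces prompts that depend on the input, and those prompts are fed to a frozen model together with the original input) can be illustrated with a minimal forward-pass sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `PromptGenerator`, `frozen_backbone` and `forward`, the toy sizes `DIM` and `NUM_PROMPTS`, the choice to prepend the prompts to the token sequence, and the use of a fixed random linear map as a stand-in for a pretrained ViT. Training is not shown.

```python
import math
import random

random.seed(0)

DIM = 4          # token embedding width (toy value, assumption)
NUM_PROMPTS = 2  # prompts generated per input (toy value, assumption)


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def mean_pool(tokens):
    """Average a list of token vectors into a single vector."""
    n = len(tokens)
    return [sum(tok[i] for tok in tokens) / n for i in range(len(tokens[0]))]


# Stand-in for the frozen, pretrained model: its weights are never updated.
FROZEN_W = rand_matrix(DIM, DIM)


def frozen_backbone(tokens):
    """Toy replacement for a frozen ViT: pool the tokens, apply a fixed map."""
    return [math.tanh(v) for v in matvec(FROZEN_W, mean_pool(tokens))]


class PromptGenerator:
    """Hypothetical lightweight generator; the only part that would be trained."""

    def __init__(self):
        # One small weight matrix per generated prompt token.
        self.weights = [rand_matrix(DIM, DIM) for _ in range(NUM_PROMPTS)]

    def __call__(self, tokens):
        # The prompts are computed from a summary of the input itself,
        # so different inputs receive different prompts.
        summary = mean_pool(tokens)
        return [matvec(w, summary) for w in self.weights]


def forward(generator, tokens):
    """Prepend the generated prompts to the input tokens, then run the frozen model."""
    prompts = generator(tokens)
    return frozen_backbone(prompts + tokens)


generator = PromptGenerator()
image_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
image_b = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
features_a = forward(generator, image_a)
features_b = forward(generator, image_b)
```

In a real setup the generator's weights would be optimized through the frozen model's output while `FROZEN_W` stays fixed, which is what keeps the number of trained parameters small relative to fine-tuning the whole model.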

 
