Self-attention Mechanism Adaptive Prompt Learning for CNNs and Transformers


Authors: YANG Pengyue; WANG Feng[1]; WEI Wei[1] (School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China)

Affiliation: [1] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China

Source: Journal of Chinese Computer Systems, 2025, No. 1, pp. 16-22 (7 pages)

Funding: National Natural Science Foundation of China (62276158); Shanxi Scholarship Council Research Project for Returned Overseas Scholars (2021-007).

Abstract: Large-scale pre-trained models learn general-purpose representations of visual data, but adapting them to specific downstream tasks is challenging. Training only a classification head leaves the result heavily dependent on the pre-trained model and yields mediocre performance, while fully fine-tuning the pre-trained model is impractical because of its sheer number of parameters. Visual prompt learning methods such as VPT also struggle on image datasets with high data diversity, where a single general prompt per dataset faces great difficulty in shifting inputs toward the original pre-training distribution. To address these challenges, this paper proposes a new prompt learning method: a task-specific self-attention prompt block is added in the input space and, with competition between channels strengthened, only a very small number of parameters is introduced to adaptively adjust the pre-trained model, so that general visual features can be transferred to a specific visual task. Experiments use representative CNN and Transformer networks as base models on datasets including CIFAR and Tiny ImageNet; the results show that the proposed method improves average accuracy by 0.55% and 1.86%, respectively, over common fine-tuning methods.
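The abstract describes the method only at a high level. For illustration, the following is a minimal PyTorch sketch of an input-space self-attention prompt block, assuming single-head spatial self-attention over pixel tokens and an SE-style squeeze-and-excitation gate as one plausible reading of "enhanced inter-channel competition". All class, function, and parameter names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torchvision


class SelfAttentionPromptBlock(nn.Module):
    """Illustrative input-space prompt block: spatial self-attention over
    pixel tokens, followed by an SE-style gate that sharpens competition
    between channels. Only this block and a new head are trained."""

    def __init__(self, channels: int = 3, heads: int = 1, reduction: int = 2):
        super().__init__()
        # Attention over spatial positions; the channel count is tiny in
        # input space, so the added parameter budget stays very small.
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        hidden = max(channels // reduction, 1)
        # Squeeze-and-excitation gate: global pool -> bottleneck MLP -> sigmoid.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # (B, C, H, W) -> (B, H*W, C): one token per pixel position.
        tokens = x.flatten(2).transpose(1, 2)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        prompt = attn_out.transpose(1, 2).reshape(b, c, h, w)
        # Channel-wise gate rescales the prompt before the residual add.
        scale = self.gate(prompt).view(b, c, 1, 1)
        return x + prompt * scale


# Usage sketch: freeze a pre-trained backbone, train only the prompt block
# and a new classification head (100 classes here is illustrative).
backbone = torchvision.models.resnet18(weights="DEFAULT")
for p in backbone.parameters():
    p.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, 100)
model = nn.Sequential(SelfAttentionPromptBlock(channels=3), backbone)
```

Note that attention over all H*W pixel tokens is quadratic in image size, so this sketch is only practical at CIFAR or Tiny ImageNet resolutions; the paper's actual block design may differ.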

Keywords: model fine-tuning; data diversity; prompt learning; self-attention prompt block; adaptive adjustment

CLC Number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
