Authors: 杨鹏跃 (YANG Pengyue); 王锋 (WANG Feng)[1]; 魏巍 (WEI Wei)[1] (School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China)
Affiliation: [1] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2025, No. 1, pp. 16-22 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (62276158) and the Shanxi Province Scientific Research Project for Returned Overseas Scholars (2021-007).
Abstract: Large-scale pre-trained models have been studied in depth on general-purpose visual data, but applying them to specific downstream tasks raises several challenges. Training only a classification head leaves performance heavily dependent on the pre-trained model and typically mediocre, while fully fine-tuning the model is impractical because of its enormous parameter count. Moreover, visual prompt learning methods such as VPT struggle when image datasets are highly diverse: a single universal prompt per dataset has great difficulty shifting the data toward the original pre-training distribution. To address these challenges, this paper proposes a new prompt learning method that adds task-specific self-attention prompt blocks in the input space and, under enhanced inter-channel competition, introduces a very small number of parameters to adaptively adjust the pre-trained model, ultimately transferring general visual features to specific visual tasks. Experiments use representative CNN and Transformer networks as base models on datasets including CIFAR and Tiny ImageNet; the results show that the proposed method improves average accuracy by 0.55% and 1.86% over common fine-tuning methods.
Keywords: model fine-tuning; data diversity; prompt learning; self-attention prompt block; adaptive adjustment
CLC number: TP391 [Automation and Computer Technology / Computer Application Technology]
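The abstract only sketches the method at a high level. As a rough illustration of the idea it describes, here is a minimal PyTorch sketch of an input-space prompt block built from self-attention, trained together with a new classification head on top of a frozen pre-trained backbone. The class names (SelfAttentionPromptBlock, ChannelCompetition), the SE-style softmax gating used to stand in for the paper's "enhanced inter-channel competition", and all hyperparameters are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as tvm

class ChannelCompetition(nn.Module):
    # Assumption: the paper's "enhanced inter-channel competition" is
    # approximated by squeeze-and-excitation-style gating with a softmax,
    # which forces channels to compete for a shared attention budget.
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, hidden, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1),
            nn.Softmax(dim=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)

class SelfAttentionPromptBlock(nn.Module):
    # Hypothetical input-space prompt block: project the image into patch
    # tokens, run lightweight self-attention over them, fold the result
    # back to image shape, and add it to the input as a task-specific prompt.
    def __init__(self, in_channels=3, embed_dim=32, patch=16, heads=4):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, embed_dim, patch, stride=patch)
        self.attn = nn.MultiheadAttention(embed_dim, heads, batch_first=True)
        self.unproj = nn.ConvTranspose2d(embed_dim, in_channels, patch, stride=patch)
        self.compete = ChannelCompetition(in_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = self.proj(x)                    # (B, D, H/p, W/p)
        b, d, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)  # (B, N, D) patch tokens
        seq, _ = self.attn(seq, seq, seq)
        prompt = self.unproj(seq.transpose(1, 2).reshape(b, d, h, w))
        return x + self.compete(prompt)          # prompted input, same shape

# Frozen pre-trained backbone: only the prompt block and the new
# classification head receive gradients, keeping trainable parameters tiny.
backbone = tvm.resnet18(weights=None)  # load pre-trained weights in practice
for p in backbone.parameters():
    p.requires_grad_(False)
backbone.fc = nn.Linear(backbone.fc.in_features, 100)  # e.g. a CIFAR-100 head

model = nn.Sequential(SelfAttentionPromptBlock(), backbone)
x = torch.randn(2, 3, 224, 224)
print(model(x).shape)  # torch.Size([2, 100])
```

Freezing the backbone and training only the prompt block and head mirrors the parameter-efficiency goal stated in the abstract; the paper's actual prompt design and channel-competition mechanism may differ substantially from this sketch.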