Cross-domain few-shot classification model based on relation network and Vision Transformer


Authors: YAN Yiqin, LUO Chuan, LI Tianrui [2], CHEN Hongmei [2]

Affiliations: [1] College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China; [2] School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China

Source: Journal of Computer Applications (《计算机应用》), 2025, No. 4, pp. 1095-1103 (9 pages)

Funding: National Natural Science Foundation of China (62076171, 62376230); Natural Science Foundation of Sichuan Province (2022NSFSC0898).

Abstract: To address the low classification accuracy of few-shot learning models under domain shift, a cross-domain few-shot image classification model based on the relation network and ViT (Vision Transformer), named ReViT (Relation ViT), is proposed. First, ViT is introduced as the feature extractor, and a pre-trained deep neural network is used to overcome the insufficient feature expression ability of shallow neural networks. Second, a shallow convolutional network serves as a task adapter to enhance the model's knowledge transfer ability, and a non-linear classifier is built on the relation network and a channel attention mechanism. Third, the features of the feature extractor and the task adapter are fused to enhance the model's generalization ability. Finally, a four-stage learning strategy of "pre-training, meta-training, fine-tuning, meta-testing" is adopted to train the model; it combines transfer learning with meta-learning and further improves the cross-domain classification performance of ReViT. Experimental results, with average classification accuracy as the evaluation metric, show that ReViT performs well on cross-domain few-shot classification problems. Specifically, on Meta-Dataset, the classification accuracy of ReViT exceeds that of the second-best model by 5.82 percentage points in the in-domain scenario and by 1.71 percentage points in the out-of-domain scenario. On three sub-problems of the BCDFSL (Broader study of Cross-Domain Few-Shot Learning) dataset, namely EuroSAT (European SATellite data), CropDisease, and ISIC (International Skin Imaging Collaboration), ReViT exceeds the second-best model by 1.00, 1.54, and 2.43 percentage points, respectively, in the 5-way 5-shot setting, and by 0.13, 0.97, and 3.40 percentage points, respectively, in the 5-way 20-shot setting. On CropDisease in the 5-way 50-shot setting, ReViT exceeds the second-best model by 0.36 percentage points. These results indicate that ReViT maintains good accuracy on image classification tasks where samples are scarce.
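The settings named in the abstract ("5-way 5-shot", "5-way 20-shot") and the metric (average classification accuracy) follow the usual episodic evaluation protocol for few-shot learning. The sketch below illustrates that general protocol only; it is not taken from the paper. The function names `sample_episode` and `mean_accuracy` and the `n_query` parameter (query samples per class) are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way=5, k_shot=5, n_query=15, rng=None):
    """Sample one N-way K-shot episode.

    labels: list with one class label per dataset sample.
    Returns (support, query), each a list of (sample_index, episode_label).
    n_query is an illustrative assumption, not a value from the paper.
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # A class is usable only if it can fill both its support and query slots.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query


def mean_accuracy(per_episode_accuracies):
    """Average classification accuracy over episodes, in percent."""
    return 100.0 * sum(per_episode_accuracies) / len(per_episode_accuracies)
```

Under this reading, a model labels the query samples of each episode using only the N x K support samples, and a gain reported in percentage points is the difference between two such mean accuracies.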

Keywords: few-shot learning; relation network; cross-domain learning; meta-learning; image classification

Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
