Authors: ZHANG Feng (张峰); WANG Gao-cai (王高才)[1] (School of Computer and Electronic Information, Guangxi University, Nanning 530000, China)
Affiliation: [1] School of Computer and Electronic Information, Guangxi University, Nanning 530000, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2023, No. 12, pp. 2784-2790 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (62062007).
Abstract: Fine-grained visual categorization (FGVC) is an important branch of computer vision, but it remains a challenging task because deformation, occlusion, and illumination differences cause high intra-class variance and low inter-class variance. This paper improves the MMAL-Net (Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization) algorithm to better solve the weakly supervised fine-grained visual classification problem. The method uses an attention object location module (AOLM) to predict the position of the object in the image and an attention part proposal module (APPM) to propose informative part regions, without requiring bounding-box or part annotations. The resulting object image contains not only almost the entire structure of the object but also more details; the part images cover many different scales and finer-grained features; and the raw image contains the complete object. These three types of images are trained under supervision by a multi-branch network. An attention mechanism is introduced: a Split-Attention module redistributes the weights of the outputs of the different branches, and SENet (Squeeze-and-Excitation Networks) is incorporated so that the model attends to channel features. The model classifies images of different scales well, is robust, can be trained end to end, and has a short inference time. Comprehensive experiments on the CUB-200-2011, FGVC-Aircraft, and Stanford Cars datasets show that the method outperforms MMAL-Net and is comparable to state-of-the-art algorithms.
Keywords: fine-grained visual categorization; weakly supervised learning; attention mechanism; Split-Attention; SENet
CLC Number: TP391 [Automation and Computer Technology / Computer Application Technology]
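The abstract describes two attention mechanisms added on top of MMAL-Net: a Split-Attention module that re-weights the outputs of the different branches, and SENet-style channel attention. The following is a minimal, hypothetical PyTorch-style sketch of those two ideas only; the class names SEBlock and BranchFusion and all parameter choices are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention (sketch of the SENet idea)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global average pooling per channel
        self.fc = nn.Sequential(                     # excitation: bottleneck MLP -> weights in (0, 1)
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)  # per-channel attention weights
        return x * w                                            # re-scale the input channels

class BranchFusion(nn.Module):
    """Soft re-weighting of feature vectors from K branches (Split-Attention-style sketch)."""
    def __init__(self, feat_dim: int, num_branches: int = 3):
        super().__init__()
        self.attn = nn.Linear(feat_dim, num_branches)

    def forward(self, feats):
        # feats: list of K tensors, each of shape (B, D), e.g. from raw/object/part branches
        stacked = torch.stack(feats, dim=1)                              # (B, K, D)
        weights = torch.softmax(self.attn(stacked.mean(dim=1)), dim=-1)  # (B, K) branch weights
        return (stacked * weights.unsqueeze(-1)).sum(dim=1)              # weighted fusion, (B, D)

In such a sketch, the fused vector from the raw-image, object, and part branches would be fed to a classifier head; this only illustrates the re-weighting idea mentioned in the abstract, not the actual MMAL-Net-based architecture.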