Authors: YE Zhipeng (叶志鹏), JIANG Feng (姜枫) [1]
Affiliation: [1] Taizhou Institute of Science and Technology, Nanjing University of Science and Technology, Taizhou 225300, Jiangsu, China
Source: Computer Engineering (《计算机工程》), 2024, No. 12, pp. 254-264 (11 pages)
Funding: Natural Science Research General Project of Jiangsu Higher Education Institutions (19KJB520038); Jiangsu Province "333 Talent Project".
Abstract: As living standards rise, demand for healthy diets keeps growing, and food image recognition has become an active research topic. Differences in processing and cooking cause foods of the same category to vary in shape and color, while foods of different categories can look alike, so food images are harder to recognize than general images. To address this, the paper proposes MTFNet, a food image classification network based on multi-feature fusion. First, the RGB color channels of the image are fused with texture features from the Local Binary Pattern (LBP) and used as input to the backbone Squeeze-and-Excitation Network (SENet). Next, a detail attention module mines the weights of each channel at different positions and uses them to locally enhance the feature maps of each layer, improving their local representation ability. Then, a self-attention mechanism computes self-attention weights between the channels of the feature map, capturing correlations between feature maps and extracting global features of the image. Finally, the locally enhanced features and the global features are concatenated and fused for classification. Experimental results show that on the food image datasets ETH Food101, ChineseFoodNet, and ISIA Food-500, MTFNet improves Top-1 accuracy by 0.44, 1.01, and 0.66 percentage points, respectively, over the Multi-scale Jigsaw Reconstruction Network (MJR-Net), the best-performing prior model, achieving better recognition performance.
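The first step in the abstract, fusing the RGB channels with LBP texture features, can be illustrated with a minimal NumPy sketch. The abstract does not say which LBP variant or fusion operator MTFNet uses, so everything here is an assumption made for illustration: the basic 3x3, 8-neighbor LBP, the grayscale weights, the channel-wise stacking, and the hypothetical function names `lbp_8neighbor` and `fuse_rgb_lbp`.

```python
import numpy as np


def lbp_8neighbor(gray: np.ndarray) -> np.ndarray:
    """Basic 3x3 LBP (assumed variant): one bit per neighbor that is >= the center."""
    gray = gray.astype(np.float32)
    h, w = gray.shape
    # Replicate border pixels so every pixel has eight neighbors.
    padded = np.pad(gray, 1, mode="edge")
    # Neighbor offsets (dy, dx), clockwise starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        mask = (neighbor >= gray).astype(np.uint8)
        code |= mask * np.uint8(1 << bit)
    return code


def fuse_rgb_lbp(rgb: np.ndarray) -> np.ndarray:
    """Stack R, G, B and an LBP map into an H x W x 4 float array in [0, 1].

    Channel stacking is an assumed fusion scheme, not necessarily the paper's.
    """
    rgb_f = rgb.astype(np.float32)
    # Standard luma weights for the grayscale conversion (an assumption).
    gray = 0.299 * rgb_f[..., 0] + 0.587 * rgb_f[..., 1] + 0.114 * rgb_f[..., 2]
    lbp = lbp_8neighbor(gray).astype(np.float32)
    return np.concatenate([rgb_f / 255.0, lbp[..., None] / 255.0], axis=-1)


if __name__ == "__main__":
    # Random stand-in for a food photo; a real pipeline would load an image file.
    image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
    fused = fuse_rgb_lbp(image)
    print(fused.shape)
```

Under these assumptions, the resulting four-channel array would be fed to a backbone whose first convolution accepts four input channels instead of three.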
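Keywords: food image classification; Local Binary Pattern (LBP); Squeeze-and-Excitation Network (SENet); detail attention; self-attention
Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]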