Food Image Classification Based on Multi-Feature Fusion

Authors: YE Zhipeng (叶志鹏), JIANG Feng (姜枫) [1]

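Affiliation: [1] Taizhou Institute of Science and Technology, Nanjing University of Science and Technology, Taizhou 225300, Jiangsu, China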

Source: Computer Engineering (《计算机工程》), 2024, No. 12, pp. 254-264 (11 pages)

Funding: General Program of Natural Science Research of Jiangsu Higher Education Institutions (19KJB520038); Jiangsu Province "333 Talent Project".

Abstract: As living standards rise, the demand for healthy diets grows, and food image recognition has become an active research topic. Differences in processing and cooking cause foods of the same class to vary in shape and color, while foods of different classes may present similar visual features, so food images are harder to recognize than general images. To address this, a multi-feature fusion food image classification network, MTFNet, is proposed. First, the RGB color channels of the image are fused with the texture features given by the Local Binary Pattern (LBP) and used as the input to the backbone Squeeze-and-Excitation Network (SENet). Next, a detail attention module mines the weight of each channel at different positions and locally enhances the feature map of each layer, improving its local representation ability. Then, a self-attention mechanism computes self-attention weights between the channels of the feature map, mining the correlations between feature maps and extracting the global features of the image. Finally, the locally enhanced features and the global features are concatenated and fused for classification. Experimental results show that on the food image datasets ETH Food101, ChineseFoodNet, and ISIA Food-500, the Top-1 accuracy of MTFNet is 0.44, 1.01, and 0.66 percentage points higher, respectively, than that of the current best model, the Multi-scale Jigsaw Reconstruction Network (MJR-Net), giving better recognition performance.
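
The abstract does not state which LBP variant is used, how the texture map is fused with the RGB channels, or how the channel self-attention is formulated. The following is therefore only a minimal NumPy sketch under stated assumptions: a basic 3x3 LBP, fusion by stacking the LBP map as a fourth input channel, and channel self-attention without learned projections. The function names `lbp_8neighbor`, `fuse_rgb_lbp`, and `channel_self_attention` are hypothetical and this is not the authors' implementation.

```python
import numpy as np


def lbp_8neighbor(gray: np.ndarray) -> np.ndarray:
    """Basic 3x3 LBP: compare each pixel with its 8 neighbours and pack the bits.

    Assumption: edge pixels use replicated borders; the paper may differ.
    """
    g = gray.astype(np.float32)
    h, w = g.shape
    padded = np.pad(g, 1, mode="edge")
    # Neighbour offsets (dy, dx), clockwise from top-left; the order is an
    # arbitrary but fixed convention.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        mask = (neighbour >= g).astype(np.uint8)
        code |= mask * np.uint8(1 << bit)
    return code


def fuse_rgb_lbp(rgb: np.ndarray) -> np.ndarray:
    """Stack R, G, B and an LBP texture map into an (H, W, 4) array in [0, 1].

    Assumption: fusion is plain channel concatenation of a uint8 RGB image
    with the LBP code map computed on its luminance.
    """
    rgb_f = rgb.astype(np.float32)
    luma = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = rgb_f @ luma  # (H, W, 3) @ (3,) -> (H, W)
    lbp = lbp_8neighbor(gray).astype(np.float32)
    return np.concatenate([rgb_f, lbp[..., None]], axis=-1) / 255.0


def channel_self_attention(feat: np.ndarray) -> np.ndarray:
    """Self-attention across channels of a (C, H, W) feature map.

    Simplification: queries, keys and values are the flattened channels
    themselves, with no learned projection matrices.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w).astype(np.float32)
    scores = x @ x.T / np.sqrt(h * w)             # (C, C) channel similarity
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ x).reshape(c, h, w)
```

In this sketch the fused array would be fed to the backbone in place of the usual three-channel input, which means the first convolution must accept four channels. The channel attention output has the same shape as its input, so it could be concatenated with a locally enhanced feature map before the classifier, as the abstract describes at a high level.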

Keywords: food image classification; local binary pattern; squeeze-and-excitation network; detail attention; self-attention

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
