Affiliation: [1] School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018
Source: Journal of Hangzhou Dianzi University: Natural Sciences (《杭州电子科技大学学报(自然科学版)》), 2025, No. 1, pp. 1-10 (10 pages)
Abstract: Cattle detection in complex environments is a key problem for accurate, machine-vision-based cattle counting. Occlusion caused by crowding, and incomplete views of animals standing at the edge of the camera frame, make existing cattle detection methods unsuitable for the complex environment of a breeding farm. This paper proposes a cattle detection algorithm for complex environments based on a Swin-Transformer (SWT)-YOLOV5s network. First, a feature-extraction backbone, SWT-Backbone, is proposed, which cascades a double-layer shortcut (SC)-SWT module with multi-layer convolution. The SC-SWT module attends to global features, while the residual multi-layer convolution module attends to local features and increases the depth and receptive field of the network, so that the model fully extracts both the global and the local features of cattle. Second, a feature-fusion detection head, SWT-Head, is proposed, which cascades a not-shortcut (NSC)-SWT module with a pyramid network. The pyramid network builds a feature pyramid that fuses, at multiple scales, the global and local features extracted by SWT-Backbone. Combining the global receptive field of the Transformer in the NSC-SWT module with the local receptive field of the CNN in the C3 module lets the model more accurately detect and select the high-semantic-level global features and the detail-bearing local features of cattle in the feature pyramid, while efficiently suppressing feature interference from the background. This improves detection accuracy for cattle in complex environments. Experiments were conducted on the COWYCTC-1480 dataset collected in the laboratory. On the test set, the precision, recall, and mAP of the proposed method are 7.0, 2.0, and 11.1 percentage points higher than those of YOLOV5s, and 20.8, 32.0, and 29.3 percentage points higher than those of SSD.
Keywords: cattle target detection; Swin-Transformer; YOLOV5s; SSD; complex environment
Classification number: TN391.41 [Electronics and Telecommunications: Physical Electronics]
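Note on the reported metrics: the abstract compares methods by precision and recall on a test set. As a minimal sketch of how such figures are commonly computed for object detection, the Python below matches a predicted box to a ground-truth box by intersection over union (IoU) and derives precision and recall from true-positive, false-positive, and false-negative counts. The function names, the corner-format boxes, and the example counts are illustrative assumptions; they are not taken from the paper, whose exact evaluation protocol is not stated in this record.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p, r = precision_recall(tp=8, fp=2, fn=2)
    print(f"precision={p:.2f}, recall={r:.2f}")
    print(f"IoU={iou((0, 0, 2, 2), (1, 1, 3, 3)):.3f}")
```

A prediction is usually counted as a true positive when its IoU with an unmatched ground-truth box exceeds a chosen threshold (0.5 is a common choice). mAP then averages, over classes, the area under the precision-recall curve.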