Authors: HOU Zhiqiang; DONG Jiale; MA Sugang; WANG Chenxu [1,2]; YANG Xiaobao; WANG Yunchen
Affiliations: [1] Institute of Computer, Xi’an University of Posts and Telecommunications, Xi’an 710121, China; [2] Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2024, No. 11, pp. 4198-4207 (10 pages)
Funding: National Natural Science Foundation of China (62072370); Natural Science Foundation of Shaanxi Province (2023-JC-YB-598).
Abstract: To address the insufficient multi-scale feature representation and the underuse of shallow features in memory-network algorithms, this paper proposes a Video Object Segmentation (VOS) algorithm based on multi-scale feature enhancement and global-local feature aggregation. First, a multi-scale feature enhancement module fuses feature information at different scales from the reference mask branch and the reference RGB branch, strengthening the multi-scale feature representation. Second, a global-local feature aggregation module extracts features using convolutions with different receptive field sizes and adaptively fuses the features of global and local regions, which captures both the global features and the fine details of the target and improves segmentation accuracy. Finally, a cross-layer fusion module uses the spatial detail in shallow features to improve the precision of the segmentation mask; fusing shallow features with deep features better captures the target's details and edges. Experiments on the public datasets DAVIS2016, DAVIS2017, and YouTube-2018 show that the algorithm's overall performance reaches 91.8%, 84.5%, and 83.0%, respectively, and that it runs in real time on both single-object and multi-object segmentation tasks.
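The global-local aggregation idea in the abstract can be illustrated with a toy sketch. This is not the paper's module. It assumes fixed mean filters in place of learned convolutions, and a softmax over each branch's average response in place of a learned fusion gate. The names `box_filter`, `aggregate`, `local_k`, and `global_k` are hypothetical.

```python
import math


def box_filter(feat, k):
    """Mean filter with a k x k window and zero padding.

    Stand-in for a learned convolution; k controls the receptive field.
    """
    h, w = len(feat), len(feat[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += feat[y][x]
            out[i][j] = total / (k * k)
    return out


def mean_abs(feat):
    """Global average of absolute responses, used as a per-branch score."""
    values = [abs(v) for row in feat for v in row]
    return sum(values) / len(values)


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def aggregate(feat, local_k=3, global_k=7):
    """Fuse a small-receptive-field branch with a large-receptive-field branch.

    The fusion weights are assumed here to come from a softmax over branch
    scores; a real model would learn this gating.
    """
    local = box_filter(feat, local_k)    # fine detail, small window
    glob = box_filter(feat, global_k)    # wider context, large window
    w_local, w_global = softmax([mean_abs(local), mean_abs(glob)])
    fused = [
        [w_local * l + w_global * g for l, g in zip(l_row, g_row)]
        for l_row, g_row in zip(local, glob)
    ]
    return fused, (w_local, w_global)


if __name__ == "__main__":
    # 9 x 9 single-channel map with one active pixel in the centre.
    feat = [[0.0] * 9 for _ in range(9)]
    feat[4][4] = 1.0
    fused, weights = aggregate(feat)
    print(weights, fused[4][4])
```

The two branches see the same input at different spatial extents, and the weights always sum to one, so the fused map stays on the same scale as its inputs.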
Keywords: video object segmentation; memory network; Siamese network; feature fusion; mask refinement
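Classification codes: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]; TP391.41 [Electronics and Telecommunications: Information and Communication Engineering]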