Document Image Layout Analysis Algorithm Based on Improved YOLOv5s

Authors: YIN Ling, LI Jiale, HUANG Bo [1]

Affiliation: [1] School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201600, China

Source: Software Guide (《软件导刊》), 2025, No. 2, pp. 146-154 (9 pages)

Funding: National Natural Science Foundation of China, Young Scientists Fund (61802251).

Abstract: To address the low efficiency and high training cost of current deep-learning-based layout analysis methods, this paper proposes RCW-YOLO, a single-stage object detection network improved from YOLOv5s, and applies it to document image layout analysis. First, the C3 module in YOLOv5s is improved with the Res2Net module, which strengthens the network's ability to extract multi-scale features from document images. Second, the lightweight upsampling operator CARAFE is introduced to optimize the feature fusion network and reduce information loss during upsampling. Finally, WIoUv3 is adopted as the bounding box regression loss function; its gradient gain allocation strategy assigns more weight to samples of average quality, improving the model's generalization ability and overall performance. Experimental results show that RCW-YOLO achieves mAP@0.50:0.95 of 87.2%, 76.4%, and 94.5% on the CDLA, IIIT-AR-13K, and PubLayNet datasets, respectively. It outperforms existing two-stage algorithms and other single-stage algorithms while requiring less computation, using fewer parameters, and running inference faster.

Keywords: document image layout analysis; object detection; YOLOv5s; multi-scale feature extraction

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]
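
The abstract says that WIoUv3 gives more weight to samples of average quality. The sketch below illustrates that idea for a single box pair, following the commonly cited Wise-IoU v3 formulation; it is not taken from the paper's own code. The function names, the `(x1, y1, x2, y2)` box format, the hyperparameter values `alpha=1.9` and `delta=3.0`, and the `mean_iou_loss` argument (standing in for the running mean of the IoU loss that a training loop would maintain) are all assumptions made for illustration.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta)).

    beta is the "outlier degree" of a sample. The gain is small for very
    low beta (easy, high-quality boxes) and for very high beta (likely
    low-quality outliers), and peaks in between.
    alpha and delta are assumed hyperparameter values.
    """
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0, eps=1e-7):
    """Wise-IoU v3 style loss for one predicted box and one target box.

    mean_iou_loss is a hypothetical stand-in for the running mean of the
    IoU loss; it must be positive.
    """
    l_iou = 1.0 - iou(pred, target)

    # Distance between box centers.
    px, py = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tx, ty = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2

    # Width and height of the smallest box enclosing both boxes. In a
    # real training loop these would be excluded from the gradient.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])

    # Distance-based attention term (the "v1" part of the loss).
    r_wiou = math.exp(((px - tx) ** 2 + (py - ty) ** 2) / (wg ** 2 + hg ** 2 + eps))

    # Outlier degree: this sample's IoU loss relative to the running mean.
    beta = l_iou / mean_iou_loss
    return focusing_gain(beta, alpha, delta) * r_wiou * l_iou


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print("IoU:", iou(pred, target))
    print("WIoU v3 loss:", wiou_v3(pred, target, mean_iou_loss=0.5))
```

With these assumed values, `focusing_gain` is larger for a mid-range outlier degree than for either extreme. That matches the behavior the abstract describes: samples of average quality contribute most to the bounding box regression gradient.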
