Layout segmentation based on Multi-WHFPN and SimAM attention mechanism  Cited by: 1


Authors: Yang Chenhui; Zhou Xiaoliang; Zhang Heng; Sun Zheng; Ye Ning[1]

Affiliations: [1] College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, China; [2] Nanjing Lantai Information Technology Co., Ltd., Nanjing 210009, China

Source: Electronic Measurement Technology, 2024, No. 1, pp. 159-168 (10 pages)

Funding: Supported by the National Key Research and Development Program of China (2016YFD0600101).

Abstract: As a preprocessing step for OCR, layout segmentation is receiving increasing attention from both academia and industry. To address slow detection, inaccurate target-region boundaries, and missed small targets in layout segmentation, the YOLOv7-MSY model is proposed. First, drawing on the idea of residual connections, the Multi-WHFPN network structure is proposed; it uses trainable weight parameters to emphasize the importance of individual features during feature fusion and adds a small-target detection head, which improves small-target detection. Second, the SimAM attention mechanism is introduced, which evaluates feature weights in three dimensions without adding extra parameters, enhancing important features and suppressing ineffective ones. Finally, YEIOU replaces the localization loss function of the original model, improving the model's convergence speed and regression accuracy. In comparative experiments on a dataset provided by the Jiangsu Provincial Archives, YOLOv7-MSY is more sensitive to target-region boundaries and detects small targets better. Its mAP@.5 reaches 0.871, which is 7.84% higher than that of the original YOLOv7 model. The model outperforms other types of layout segmentation algorithms, generalizes well, and maintains a relatively high segmentation speed.
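The abstract notes that SimAM scores features in three dimensions without adding parameters. A minimal sketch of that idea follows, based on the commonly published SimAM formulation rather than on this paper's code, which the record does not include. The function names `simam` and `_sigmoid`, the regularizer `lam` and its default value, and the nested-list `[C][H][W]` layout are all assumptions made for illustration; a real model would use a tensor library instead of plain lists.

```python
import math


def _sigmoid(z):
    """Logistic function; z is always >= 0.5 here, so exp(-z) cannot overflow."""
    return 1.0 / (1.0 + math.exp(-z))


def simam(feature_map, lam=1e-4):
    """Apply SimAM-style attention to a feature map given as nested lists [C][H][W].

    Each activation gets its own weight (hence "3D"), computed only from
    per-channel statistics, so no trainable parameters are introduced.
    `lam` is an assumed regularization constant.
    """
    out = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        n = max(len(values) - 1, 1)          # other positions in the channel
        mu = sum(values) / len(values)       # channel mean
        sq_dev = [[(v - mu) ** 2 for v in row] for row in channel]
        var = sum(d for row in sq_dev for d in row) / n
        # Inverse energy: activations far from the channel mean score higher.
        out.append([
            [v * _sigmoid(d / (4.0 * (var + lam)) + 0.5)
             for v, d in zip(row, drow)]
            for row, drow in zip(channel, sq_dev)
        ])
    return out


if __name__ == "__main__":
    # One 3x3 channel with a single distinctive activation in the centre.
    fmap = [[[1.0, 1.0, 1.0],
             [1.0, 5.0, 1.0],
             [1.0, 1.0, 1.0]]]
    weighted = simam(fmap)
    print("centre weight:", weighted[0][1][1] / 5.0)
    print("corner weight:", weighted[0][0][0] / 1.0)
```

In this sketch the centre activation, which stands out from its channel, receives a larger weight than the uniform background, which is the "enhance important features, suppress ineffective ones" behaviour the abstract describes.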

Keywords: layout segmentation; YOLOv7-MSY; Multi-WHFPN; SimAM attention mechanism; YEIOU

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
