Layout Analysis Method of Multi-scale Feature Fusion

Authors: QIAO Jia (乔佳), XU Kun (徐琨) [1], HU Peirong (胡佩蓉) (School of Information Engineering, Chang'an University, Xi'an 710018, China)

Affiliation: [1] School of Information Engineering, Chang'an University, Xi'an 710018, Shaanxi, China

Source: Computer and Modernization (《计算机与现代化》), 2024, No. 5, pp. 16-21 (6 pages)

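Funding: National Natural Science Foundation of China (52172302); National Key Research and Development Program of China (2019YFB1600103); Key Research and Development Program of Shaanxi Province (2018ZDXM-GY-044).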

Abstract: Current document layout element analysis suffers from misclassification between lists and text, difficulty recognizing small-scale text inside tables, and poor preservation of spatial features. To address these problems, this paper follows a bottom-up approach and proposes a multi-feature fusion layout analysis method based on the SegNet network. The method introduces the MSCAN-SE attention module into SegNet. To address the low recognition rate of small-scale elements in tables, the strip features in MSCAN-SE are used to improve the model's ability to extract multi-scale features, so that the network retains feature information at more scales. To address the excessive similarity between the features of list elements and text elements, the dilated convolution and the channel attention branch in MSCAN-SE are used to enlarge the network's receptive field during feature extraction. The proposed method is compared experimentally with classical semantic segmentation networks. On the layout analysis test set, it achieves a pixel accuracy of 97.9% and a mean intersection over union (mIoU) of 91.7%; its mIoU is 7.6%, 2.4%, 2.6%, and 1.5% higher than that of the U-Net, FCN, DeepLabV3+, and SegNet semantic segmentation models, respectively.
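The abstract reports results as pixel accuracy and mean intersection over union but does not define either metric. The sketch below shows how these two quantities are commonly computed from a confusion matrix in semantic segmentation. It assumes the standard definitions, which may differ in detail from the authors' evaluation code; the function names, the class labels (0 = background, 1 = text, 2 = list), and the toy label maps are all hypothetical and chosen only for illustration.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    # Encode each (true, predicted) pair as a single index, then count occurrences.
    index = y_true * num_classes + y_pred
    counts = np.bincount(index, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the ground truth."""
    return np.trace(cm) / cm.sum()


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0  # ignore classes that appear in neither map
    return (tp[present] / union[present]).mean()


# Hypothetical 2x3 label maps: one "text" pixel is mislabeled as "list".
y_true = np.array([[0, 1, 1],
                   [2, 2, 0]])
y_pred = np.array([[0, 1, 2],
                   [2, 2, 0]])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print("pixel accuracy:", pixel_accuracy(cm))
print("mean IoU:", mean_iou(cm))
```

In this toy case five of six pixels are correct, so pixel accuracy is 5/6, while the per-class IoU values are 1, 1/2, and 2/3, giving a mean IoU of 13/18. The gap between the two numbers illustrates why mean IoU is the stricter metric when one class, such as list versus text, is confused with another.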

Keywords: layout analysis; multi-scale attention; semantic segmentation; channel attention

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
