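Authors: QIAO Jia (乔佳); XU Kun (徐琨)[1]; HU Peirong (胡佩蓉) (School of Information Engineering, Chang'an University, Xi'an 710018, China)
Source: Computer and Modernization (《计算机与现代化》), 2024, No. 5, pp. 16-21 (6 pages)
Funding: National Natural Science Foundation of China (52172302); National Key Research and Development Program of China (2019YFB1600103); Key Research and Development Program of Shaanxi Province (2018ZDXM-GY-044).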
Abstract: Current document layout element analysis suffers from three problems: lists and text are misclassified, small-scale text inside tables is hard to recognize, and spatial features are poorly preserved. Following a bottom-up approach, this paper proposes a multi-feature fusion layout analysis method based on the SegNet network. The method introduces an MSCAN-SE module into SegNet. To address the low recognition rate of small-scale elements in tables, the strip features in the MSCAN-SE attention mechanism are used to improve the model's multi-scale feature extraction, so that the network retains feature information at more scales. To address the excessive similarity between list and text element features, the dilated convolution and channel attention branch in MSCAN-SE are used to enlarge the network's receptive field during feature extraction. The method is compared experimentally with classical semantic segmentation networks. On the layout analysis test set it achieves a pixel accuracy of 97.9% and a mean intersection over union (mIoU) of 91.7%; the mIoU is 7.6%, 2.4%, 2.6%, and 1.5% higher than that of the U-Net, FCN, DeepLabV3+, and SegNet semantic segmentation models, respectively.
Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]
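
The abstract reports two evaluation metrics, pixel accuracy and mean intersection over union (mIoU). As a reading aid, the sketch below shows how these two metrics are commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names, the three-class label set (0 = text, 1 = list, 2 = table), and the toy labels are all assumptions made for illustration.

```python
# Minimal sketch of pixel accuracy and mean IoU for semantic segmentation.
# All names and the toy data below are hypothetical, not from the paper.

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """Fraction of all pixels that were assigned the correct class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0

def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

# Toy flattened label maps; assumed classes: 0 = text, 1 = list, 2 = table.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
pa = pixel_accuracy(cm)
miou = mean_iou(cm)
```

In this toy case four of six pixels are correct, so pixel accuracy is 4/6, while the per-class IoUs are 1/3, 2/3, and 1/2, giving an mIoU of 0.5. The gap between the two numbers illustrates why mIoU is usually the stricter metric: it penalizes both false positives and false negatives for every class.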