Bill Text Recognition Based on Bidirectional Long Short-Term Memory and Sparse Self-Attention Mechanism


Authors: FENG Xianwei[1]; YAO Wei[1] (College of Digital Commerce, Jiangsu Vocational Institute of Commerce, Nanjing, Jiangsu 211168, China)

Affiliation: [1] College of Digital Commerce, Jiangsu Vocational Institute of Commerce, Nanjing, Jiangsu 211168, China

Source: Chinese Journal of Sensors and Actuators, 2024, No. 11, pp. 1946-1951 (6 pages)

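Funding: Key Project of the Jiangsu Provincial Education Science Planning Program, 2024 (B-a/2024/14); Jiangsu Universities "Qinglan Project" (Su Jiao Shi Han [2022] No. 29); Jiangsu Vocational Institute of Commerce "Leading Talent" Project (Jing Mao Ren [2021] No. 28).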

Abstract: A bill text recognition method based on a bidirectional long short-term memory network (BiLSTM) and a sparse self-attention mechanism is proposed. To counter the challenges of complex layouts, variable fonts, and background noise in bill text recognition, a deep convolutional neural network is used for preprocessing: it extracts the text regions and converts the image data into sequence data, which is then fed into the BiLSTM model. Through its bidirectional structure, the BiLSTM captures both forward and backward information in the text sequence, which improves the accuracy of text understanding. To further improve recognition performance, a sparse self-attention mechanism is introduced. It computes relevance scores between different positions in the sequence and forms a sparse attention matrix, thereby capturing long-distance dependencies in the text. This mechanism both reduces computational complexity and strengthens the model's focus on key information. Experimental results show that the proposed method performs well on complex bill text, achieving high recognition accuracy and efficiency. Compared with traditional methods, it adapts better to the diversity and complexity of bill text and shows good robustness and generalization ability in practical applications.
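The sparse self-attention step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the attention matrix is made sparse, so the top-k rule used here (each position attends only to its `k` highest-scoring positions) is an assumption for illustration, not the authors' method. The function name `topk_sparse_attention` and the toy feature vectors, which stand in for BiLSTM outputs, are likewise hypothetical.

```python
import math

def topk_sparse_attention(queries, keys, values, k):
    """Scaled dot-product self-attention that keeps only the k highest-scoring
    positions per query (an assumed sparsification rule, not the paper's)."""
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        # Relevance score between this position and every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d) for key in keys]
        # Indices of the k largest scores; every other position is masked out.
        keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax restricted to the retained positions (shifted for numerical stability).
        m = max(scores[j] for j in keep)
        exp = {j: math.exp(scores[j] - m) for j in keep}
        z = sum(exp.values())
        row = [exp[j] / z if j in exp else 0.0 for j in range(len(scores))]
        weights.append(row)
        # Weighted sum of value vectors over the retained positions only.
        outputs.append([sum(row[j] * values[j][c] for j in keep)
                        for c in range(len(values[0]))])
    return outputs, weights

# Toy sequence of 4 feature vectors standing in for BiLSTM outputs (hypothetical data).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
outputs, weights = topk_sparse_attention(features, features, features, k=2)
```

With `k` fixed, each row of the attention matrix has at most `k` nonzero entries, so the weighted sum touches `k` value vectors per position instead of the full sequence. This sketch still computes every score before selecting the top `k`; schemes that restrict which scores are computed in the first place would be needed to reduce the cost of the scoring step itself.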

Keywords: sparse attention mechanism; bidirectional long short-term memory network; bill text recognition; optical character recognition

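Classification: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]; TP183 [Electronics and Telecommunications: Information and Communication Engineering]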

 
