Research on a Handwritten Expression Image Recognition Algorithm Based on MD-CycleGAN


Authors: LYU Chuang (吕闯)[1], SHUI Qingmei (水卿梅)[1]

Affiliation: [1] Pass College of Chongqing Technology and Business University, Chongqing 401520, China

Source: Laser Journal (《激光杂志》), 2024, No. 8, pp. 169-174 (6 pages)

Funding: National Natural Science Foundation of China (No. 62302475).

Abstract: When a generative adversarial network is used to generate images, word vectors or character vectors struggle to reconstruct the two-dimensional structure of mathematical expressions. To address this, the task of generating handwritten mathematical expression images is recast as a style transfer problem from printed to handwritten mathematical expressions, and a self-built dataset with handwriting-style labels is used to train the style transfer model. To address the incomplete content, distorted details, and low quality of images generated by CycleGAN, a multi-scale discriminative cycle-consistent generative adversarial network, MD-CycleGAN, is designed. It introduces the CBAM attention mechanism to compensate for information lost during downsampling, and it replaces the ReLU activation function with the ACON activation function, which adaptively learns to control the degree of nonlinearity of each network layer. Experimental results show that the data augmentation method based on generative adversarial networks effectively reduces model overfitting. This study provides a new method for the automatic recognition of handwritten mathematical expression images, overcoming the data annotation and model generalization problems, and it has broad application potential in fields including mathematics education, scientific document processing, and mathematical search engines.
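The abstract states that ACON replaces ReLU so that each layer can learn how nonlinear to be. As a rough illustration only, the sketch below uses the ACON-C form from the original ACON proposal, f(x) = (p1 - p2)·x·sigmoid(β·(p1 - p2)·x) + p2·x. Which ACON variant MD-CycleGAN uses is not stated in the abstract, so the choice of ACON-C, the function names `acon_c` and `sigmoid`, and the default parameter values are all assumptions. In a real network p1, p2, and β would be learnable (typically per channel); here they are plain floats.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """ACON-C activation (assumed variant; hypothetical helper name).

    beta controls how sharply the unit switches between the two linear
    branches p1*x and p2*x:
      - beta == 0      -> the average of the two branches (a linear unit)
      - beta very large -> max(p1*x, p2*x), which is ReLU for p1=1, p2=0
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def relu(x: float) -> float:
    return max(x, 0.0)


if __name__ == "__main__":
    for beta in (0.0, 1.0, 1e6):
        values = [round(acon_c(x, beta=beta), 4) for x in (-2.0, -0.5, 0.5, 2.0)]
        print(f"beta={beta:g}: {values}")
```

With p1 = 1 and p2 = 0, a small β makes the unit nearly linear and a large β makes it behave like ReLU, which is one way to read the abstract's claim that the degree of nonlinearity is learned rather than fixed.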

Keywords: MD-CycleGAN; handwritten mathematical expressions; image recognition; neural network

Classification number: TN249 [Electronics and Telecommunications: Physical Electronics]

 
