Patent Label Recognition Based on Improved CRNN Algorithm


Authors: SUN Xue-jiao; XIAO Shi-bin [1,2]; DU Yun-cheng (School of Computer Science, Beijing Information Science and Technology University; TRS Information Technology Co., Ltd., Beijing 100101, China)

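Affiliations: [1] School of Computer Science, Beijing Information Science and Technology University; [2] TRS Information Technology Co., Ltd., Beijing 100101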

Source: Software Guide (《软件导刊》), 2022, No. 12, pp. 38-45 (8 pages)

Funding: Project of the Development and Reform Commission of Haidian District, Beijing (2019-2021).

Abstract: This paper applies deep learning to patent drawings in the mechanical field, aiming to make fuller use of the information in patent drawings and to provide a supplementary means of patent retrieval. It proposes CRNN_Eca, a method for recognizing reference signs (labels) in patent drawings based on an improved CRNN algorithm. The feature-extraction backbone is replaced with ResNet34 and combined with the ECA module from ECA-Net, an extremely lightweight and efficient channel attention mechanism, to form the Eca-ResNet feature extraction network. After Eca-ResNet extracts features from the original image, the features are converted into a feature sequence, and a deep bidirectional GRU network followed by CTC predicts the recognized reference sign. On the reference-sign validation and test sets, the algorithm reaches accuracies of 90.15% and 88.27%, which are 4.09% and 4.17% higher than the original CRNN algorithm, and the detection speed is also greatly improved. The experimental results show that CRNN_Eca achieves high recognition accuracy and fast recognition speed, making it an effective algorithm for patent reference sign recognition.
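The abstract names two building blocks whose mechanics are easy to show in a few lines: ECA-style channel attention and CTC output decoding. The following is a minimal pure-Python sketch, not the authors' implementation. The function names, the `ALPHABET` string, and the toy inputs are hypothetical. The kernel-size rule uses the defaults gamma=2 and b=1 from the ECA-Net reference code, and the decoder shown is simple greedy (best-path) decoding, which is only one of several possible CTC decoding strategies. In a real network the 1D convolution weights are learned.

```python
import math

def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size as in ECA-Net: an odd size derived from log2(C)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1

def eca_gate(feature_map, weights):
    """Apply ECA-style channel attention.

    feature_map: list of C channels, each an H x W list of lists of floats.
    weights: 1D convolution kernel of odd length k (learned in practice,
             supplied by hand here as an assumption).
    Returns (rescaled feature map, per-channel gates).
    """
    k = len(weights)
    pad = k // 2
    # 1) Global average pooling: one scalar per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in feature_map]
    # 2) 1D convolution across neighbouring channels (zero padded).
    padded = [0.0] * pad + pooled + [0.0] * pad
    conv = [sum(weights[j] * padded[i + j] for j in range(k))
            for i in range(len(pooled))]
    # 3) Sigmoid turns each response into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-v)) for v in conv]
    # 4) Rescale every channel by its gate.
    out = [[[x * g for x in row] for row in ch]
           for ch, g in zip(feature_map, gates)]
    return out, gates

def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse repeated labels, then drop blanks (CTC best-path decoding)."""
    decoded = []
    prev = None
    for lab in frame_labels:
        if lab != blank and lab != prev:
            decoded.append(lab)
        prev = lab
    return decoded

# Hypothetical alphabet for reference signs; index 0 is the CTC blank.
ALPHABET = "-0123456789ab"

def decode_text(frame_labels):
    """Map per-frame argmax labels to a reference-sign string."""
    return "".join(ALPHABET[i] for i in ctc_greedy_decode(frame_labels))

if __name__ == "__main__":
    print(eca_kernel_size(64), eca_kernel_size(512))
    print(decode_text([2, 2, 0, 3, 3, 0, 11]))
```

Because the attention step adds only a k-tap 1D convolution over the pooled channel vector, its parameter cost is negligible next to the ResNet34 backbone, which is consistent with the abstract's description of ECA as extremely lightweight.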

Keywords: patent reference signs; text recognition; attention mechanism; natural language processing; deep learning

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
