A Study on Seal Recognition Method Based on Data Augmentation and Vision Transformer (基于数据增强和ViT的印章识别方法研究)

Cited by: 1

Authors: Zhang Zhijian (张志剑); Xia Sudi (夏苏迪); Liu Zhenghao (刘政昊); Wang Wenhui (王文慧); Chen Shuaipu (陈帅朴); Huo Chaoguang (霍朝光)

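Affiliations: [1] School of Information Management, Wuhan University, Wuhan 430072; [2] Big Data Institute, Wuhan University, Wuhan 430072; [3] The Center for Studies of Information Resources, Wuhan University, Wuhan 430072; [4] School of Health Economics and Management, Nanjing University of Chinese Medicine, Nanjing 210023; [5] School of Information Resource Management, Renmin University of China, Beijing 100872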

Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2024, No. 3, pp. 327-338 (12 pages)

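Funding: National Social Science Fund of China, special research program "Accelerating the Construction of the Disciplinary, Academic, and Discourse Systems of Philosophy and Social Sciences with Chinese Characteristics", project "Research on Basic Theoretical Issues of Library and Information Science with Chinese Characteristics in the New Era" (19VXK09).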

Abstract: Seal recognition is difficult because seal images are hard to collect and annotate and are often degraded. Data augmentation can alleviate this data scarcity, and using the vision transformer (ViT) model to extract global features of seals can improve recognition in complex scenarios. We first analyze the characteristics of the contexts in which seals appear and design data augmentation strategies based on that analysis to expand the training set. Seal images are then fed into the ViT model for feature extraction and recognition. We collected and annotated 1,259 seals from 16 calligraphy and painting works, such as "Lanting Xu" (《兰亭序》). After processing by 11 data augmentation modules, the training set contained 127,159 seal images. Compared with the baseline model ResNet50, the ViT model improved the F1 score by 12.17 percentage points. When the extended data obtained through data augmentation were removed, none of the models converged. With limited annotated data, data augmentation combined with the ViT model can therefore recognize seal images accurately. The proposed method still lacks semantic reasoning ability and cannot recognize seals that do not appear in the training set.
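The augmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the 11 augmentation modules, so the three operations below (`fade`, `add_noise`, `occlude`) are hypothetical stand-ins for typical seal degradations, and the nested-list image format is a toy assumption chosen to keep the example dependency-free. The value `copies_per_image=100` is also an assumption: it is merely consistent with the reported figures, since 1,259 × 101 = 127,159, but the abstract does not state how the expanded set was composed.

```python
import random

# Toy image format (an assumption): a list of rows, each pixel a float in [0, 1],
# where 0 is blank paper and 1 is full seal ink.
Image = list[list[float]]


def fade(img: Image, factor: float) -> Image:
    """Scale ink intensity down to imitate faded seal paste."""
    return [[p * factor for p in row] for row in img]


def add_noise(img: Image, rate: float, rng: random.Random) -> Image:
    """Replace a fraction of pixels with random values to imitate stains or wear."""
    return [[rng.random() if rng.random() < rate else p for p in row] for row in img]


def occlude(img: Image, top: int, left: int, size: int) -> Image:
    """Blank out a square block to imitate overlap with strokes or other seals."""
    out = [row[:] for row in img]
    for r in range(top, min(top + size, len(out))):
        for c in range(left, min(left + size, len(out[r]))):
            out[r][c] = 0.0
    return out


def augment_dataset(samples, copies_per_image, seed=0):
    """Return the originals plus `copies_per_image` degraded copies of each image.

    Labels are carried over unchanged, so no new seal classes are created.
    """
    rng = random.Random(seed)
    out = list(samples)
    for img, label in samples:
        h, w = len(img), len(img[0])
        for _ in range(copies_per_image):
            aug = fade(img, rng.uniform(0.5, 1.0))
            aug = add_noise(aug, rate=0.05, rng=rng)
            size = max(1, min(h, w) // 4)
            aug = occlude(aug, rng.randrange(h), rng.randrange(w), size)
            out.append((aug, label))
    return out


# Usage: one 8x8 checkerboard "seal" expanded to 1 original + 100 copies.
seal = [[1.0 if (r + c) % 2 == 0 else 0.0 for c in range(8)] for r in range(8)]
train = augment_dataset([(seal, "seal_A")], copies_per_image=100)
print(len(train))  # 101
```

Because every augmented copy inherits the label of its source image, a scheme of this kind enlarges the training set without adding new seal classes, which matches the stated limitation that seals absent from the training set cannot be recognized.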

Keywords: seal recognition; deep learning; data augmentation; digital humanities

Classification No. (CLC): TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
