Authors: LI Yan (李岩), ZHANG Minyi (张敏艺), SU Hanchen (宿汉辰), LI Fangfang (李芳芳)[3], LI Binyang (李斌阳)[1] (School of Cyber Science and Engineering, University of International Relations, Beijing 100191, China; Advertising Institute, Communications University of China, Beijing 100024, China; School of Computer Science and Engineering, Central South University, Changsha 410083, China)
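Affiliations: [1] School of Cyber Science and Engineering, University of International Relations, Beijing 100191 [2] Advertising Institute, Communications University of China, Beijing 100024 [3] School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083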
Source: Radio Engineering (《无线电工程》), 2023, No. 3, pp. 501-507 (7 pages)
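Funding: National Natural Science Foundation of China (61976066); Beijing Natural Science Foundation (4212031); Natural Science Foundation of Hunan Province (2021JJ30870); University of International Relations special research fund for the development of advanced national security disciplines (2019GA43, 2021GA07).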
Abstract: Scene text retrieval searches a scene for text instances that are identical or similar to a given query text and localizes them. Performing text retrieval with computer vision methods helps users automatically find the text they are interested in within a specified scene, so the technique is widely used in applications such as image security review and book retrieval. However, text in some scenes often appears in irregular forms, such as curved, compressed, or stretched, which makes extracting and matching text regions very challenging. To address this problem, an end-to-end network model is established that unifies irregular text extraction and cross-modal similarity learning in a single framework and ranks the detected text instances by the learned similarity, thereby retrieving irregular text. Experiments on three datasets, SVT, STR, and CTR, show that the proposed framework improves average accuracy by 1% to 3% over the best existing text retrieval method while maintaining an inference speed of 3.7 frames per second. To further verify the effectiveness of the proposed method for irregular text retrieval, a new irregular text dataset, AIDATA, is established and comparative experiments are run against the STR-TDSL method. The results show that average accuracy improves by more than 25% while inference speed drops by less than 20%.
Keywords: scene text retrieval; end-to-end training; irregular text; similarity learning
Classification number: TP391 [Automation and Computer Technology - Computer Application Technology]