TV Logo Detection and Recognition Based on Data Synthesis and Metric Learning

Authors: ZHANG Guang-Peng; ZHANG Dong-Ming [2]; ZHANG Jing; WANG Chuan-Ning; WANG Li-Dong; ZOU Xue-Qiang [2]

Affiliations: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; [2] National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China; [3] Beijing Radio & Television Station, Beijing 100022, China

Source: Journal of Software (《软件学报》), 2022, No. 9, pp. 3180-3194 (15 pages)

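Funding: National Key Research and Development Program of China (2018YFB080402); National Natural Science Foundation of China (61672495, 61971016); Beijing Natural Science Foundation and Beijing Municipal Education Commission joint funding project (KZ201910005007).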

Abstract: A TV logo carries important semantic information about a video. Detecting and recognizing it is difficult, however, because logos span many categories, have complex structures, occupy small regions, carry little information, and suffer from severe background interference. To improve the generalization ability of the detection model, this study proposes synthesizing TV logo data by superimposing TV logo images on background images, and uses the result to construct the training dataset. It further proposes a two-stage scalable logo detection and recognition (SLDR) method, which uses batch-hard metric learning to rapidly train a matching model that determines the category of a TV logo. Because SLDR separates detection from recognition, its detection targets can be extended to unknown categories. The experimental results show that synthetic data effectively improves the generalization ability and detection precision of the models, and that SLDR achieves precision comparable to that of an end-to-end model without updating the detection model.
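To make the data synthesis step concrete, the following is a minimal sketch of superimposing a logo on a background image and recording its bounding box as the detection label. The abstract does not say how the superposition is done; alpha blending, the grayscale list-of-rows image representation, uniform random placement, and the names `overlay_logo` and `synthesize_sample` are all assumptions made for illustration, not the authors' procedure.

```python
import random


def overlay_logo(background, logo, alpha, top, left):
    """Alpha-blend `logo` onto a copy of `background` at (top, left).

    Assumption: images are lists of rows of grayscale floats, and `alpha`
    has the logo's shape with values in [0, 1].
    Returns (image, bbox), bbox = (left, top, right, bottom), with the
    right and bottom edges exclusive.
    """
    h, w = len(logo), len(logo[0])
    if (top < 0 or left < 0
            or top + h > len(background) or left + w > len(background[0])):
        raise ValueError("logo does not fit inside the background")
    out = [row[:] for row in background]  # leave the input untouched
    for y in range(h):
        for x in range(w):
            a = alpha[y][x]
            out[top + y][left + x] = (
                a * logo[y][x] + (1.0 - a) * background[top + y][left + x]
            )
    return out, (left, top, left + w, top + h)


def synthesize_sample(background, logo, alpha, rng=random):
    """Place the logo at a uniformly random valid position (an assumption)."""
    h, w = len(logo), len(logo[0])
    top = rng.randint(0, len(background) - h)
    left = rng.randint(0, len(background[0]) - w)
    return overlay_logo(background, logo, alpha, top, left)
```

Calling `synthesize_sample` repeatedly over many backgrounds and logos would yield image and box pairs for training a detector; a practical pipeline would likely use an array library and colour images rather than nested lists.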

Keywords: data synthesis; metric learning; scalable; TV logo detection and recognition

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
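The batch-hard metric learning named in the abstract can be illustrated with the commonly used batch-hard triplet loss: for each anchor in a batch, take the farthest same-class sample and the nearest different-class sample, then apply a margin. The sketch below assumes Euclidean distance, a margin of 0.3, and a hypothetical function name `batch_hard_triplet_loss`; the abstract specifies none of these, so this shows the general technique rather than the paper's exact loss.

```python
import math


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Mean batch-hard triplet loss over one batch.

    embeddings: list of equal-length float sequences (one per sample).
    labels:     list of class labels, same length as embeddings.
    margin:     assumed value; the source does not state one.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        # Distances to every other sample of the same class.
        pos = [math.dist(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        # Distances to every sample of a different class.
        neg = [math.dist(anchor, e)
               for e, l in zip(embeddings, labels) if l != label]
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        # Hardest positive is the farthest; hardest negative is the nearest.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0
```

Under this scheme, a logo crop from a category unseen during training could be matched by comparing its embedding with reference embeddings, which fits the abstract's point that recognition can be extended without updating the detector.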
