Text-based image annotation algorithm for traditional costume images in mixed image-text layouts  Cited by: 4

A method of automatic image annotation for image-text mixed domain books


Authors: 赵海英, 高子惠, 邓恋, 侯小刚, 李宁[1] (ZHAO Hai-ying; GAO Zi-hui; DENG Lian; HOU Xiao-gang; LI Ning)

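Affiliations: [1] School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; [2] School of Digital Media and Design Arts, Beijing University of Posts and Telecommunications, Beijing 100876, China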

Source: Journal of Graphics (《图学学报》), 2021, No. 3, pp. 398-405 (8 pages)

Funding: Basic scientific research fund project of Beijing University of Posts and Telecommunications (2020RC26).

Abstract: Efficiently interpreting and intelligently processing massive image-text data is both challenging and of practical value, yet the accuracy of automatic annotation depends heavily on the quality and quantity of training samples. To address this, a text-based image annotation method for digital books with mixed image-text layouts is proposed. The method consists of three parts: mixed-layout recognition preprocessing, construction of domain image semantic labels, and a text-based image annotation algorithm for large label spaces. First, the proposed mixed-layout recognition algorithm extracts the images, titles, and description text from the mixed layout. Then, a traditional cultural domain lexicon (PatternNet) is built from the semantic labels of digital costume images. Finally, based on the characteristics of the lexicon's label space, an improved text-based image annotation algorithm for large label spaces is proposed. Simulation experiments on costume books with mixed image-text layouts, together with comparisons on other datasets, verify the effectiveness of the algorithm.
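The abstract does not give the scoring details of the annotation step, so the following is only a minimal sketch of the general idea of labeling an image from its surrounding text, not the authors' algorithm. It assumes the title and description have already been extracted from the page, that a small set of strings can stand in for a domain lexicon such as PatternNet, and that title matches deserve more weight than description matches. The names `LEXICON`, `annotate`, `top_k`, and `title_weight` are hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for a domain lexicon such as PatternNet; these
# entries are illustrative and not taken from the paper.
LEXICON = {"dragon pattern", "cloud pattern", "embroidery", "silk", "robe"}


def annotate(title, description, lexicon=LEXICON, top_k=3, title_weight=2):
    """Rank lexicon labels by how often they occur in the text near an image.

    Title matches count `title_weight` times as much as description matches.
    This weighting is an assumption for illustration, not the paper's scoring.
    """
    scores = Counter()
    title_l, desc_l = title.lower(), description.lower()
    for label in lexicon:
        score = title_weight * title_l.count(label) + desc_l.count(label)
        if score > 0:
            scores[label] = score
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:top_k]]


labels = annotate(
    title="Silk robe with dragon pattern",
    description="The robe is decorated with embroidery; the dragon pattern "
                "is surrounded by a cloud pattern.",
)
print(labels)
```

Plain substring counting is used here only to keep the sketch self-contained; Chinese source text would need word segmentation first, and a large label space would call for a more careful ranking than raw counts.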

Keywords: text-based image annotation; image annotation; image-text mixed layout processing; domain keyword extraction

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
