Large-scale image dataset for perceptual hashing (面向感知哈希的图像数据集)

Authors: Zhou Yuanding (周元鼎); Fang Yaodong (房耀东); Qin Chuan (秦川) [1] (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)

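Affiliation: [1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China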

Source: Journal of Image and Graphics (中国图象图形学报), 2024, No. 2, pp. 343-354 (12 pages)

Funding: National Natural Science Foundation of China (62172280, U20B2051); Natural Science Foundation of Shanghai (21ZR1444600).

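Abstract: Objective: Perceptual image hashing, also known as image abstract or image fingerprint, is an effective image authentication technique that has received wide attention in recent years. It converts the perceptually robust features of an image into a fixed-length hash sequence in order to authenticate image copyright. However, the field has long lacked a reasonably general dataset, and the content-preserving operations used in existing datasets differ considerably from real-world scenarios, so neural network architectures trained on them perform markedly worse when faced with complex image editing operations. Method: For the perceptual image hashing task, a new dataset is constructed for practical image content authentication scenarios. First, the content-preserving operations common in practice are summarized and classified, and 48 single and compound content-preserving operations are designed to generate perceptually similar images. Then, following the definition of perceptual image hashing, images that are semantically similar to the image to be authenticated but differ in perceptual content are selected as perceptually dissimilar images, which makes the dataset harder to discriminate. Finally, a perceptual hashing image dataset containing 116 400 images is built. Result: Because the content-preserving operations used in the proposed dataset are more complex and the dissimilar images are harder to tell apart, deep neural networks trained on this dataset generalize well: even without retraining or fine-tuning, they achieve good authentication performance on other datasets. In addition, networks trained on this dataset show only small performance differences across datasets, which reflects the stability of the dataset. Conclusion: An image dataset for perceptual hashing is designed, and extensive comparative experiments demonstrate its effectiveness. This work can help advance the field of perceptual image hashing.

Objective: With the rapid development of social media, multimedia information on the internet is updated at an exponential rate. Obtaining and transmitting digital images have become convenient, considerably increasing the risk of malicious tampering and forgery of images. Accordingly, increasing attention is given to image authentication and content protection. Many image authentication schemes have emerged recently, such as watermarking, the use of digital signatures, and perceptual image hashing (PIH). PIH, also known as image abstract or image fingerprint, is an effective technique for image authentication that has attracted widespread research attention in recent years. The goal of PIH is to authenticate an image by compressing perceptually robust features into a compact hash sequence with a fixed length. However, a general dataset in this field is lacking, and the datasets constructed in other methods have many problems. On the one hand, the types of image content-preserving manipulations used in these datasets are few and the intensity of attacks is relatively weak. On the other hand, the distinct images used in these datasets are extremely different from the images that must be authenticated, making it easy to distinguish them from each other. The convolutional neural networks (CNNs) trained on these datasets have poor generalizability and can hardly cope with the complex and diverse image editing operations in reality. This important factor has limited the development of the PIH field. Method: On the basis of the preceding knowledge, we propose a specialized dataset based on various manipulations in this study. This dataset can deal with complex image authentication scenarios. The proposed dataset is divided into three subsets: original, perceptually identical, and perceptually distinct images. The latter two correspond to the robustness and discrimination of PIH, respectively. Original images are selected from ImageNet1K, and each of them corresponds to one category. For identical images, we summarize the content-preserving manipulations c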
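The two properties named in the abstract, robustness to content-preserving manipulations and discrimination between distinct images, can be illustrated with a toy hash. The sketch below is only an illustration under stated assumptions: it uses a simple block-mean hash rather than the CNN-based hashing evaluated in the paper, and the names `average_hash`, `brighten`, `is_same_content`, and the threshold value of 4 are hypothetical choices made for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of 0-255 intensities


def average_hash(img: Image, size: int = 4) -> List[int]:
    """Toy block-mean hash (not the paper's method): split the image into
    size x size blocks, then set one bit per block by comparing the block
    mean with the mean of all block means."""
    h, w = len(img), len(img[0])
    bh, bw = h // size, w // size
    means = []
    for by in range(size):
        for bx in range(size):
            block = [img[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            means.append(sum(block) / len(block))
    avg = sum(means) / len(means)
    return [1 if m > avg else 0 for m in means]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def brighten(img: Image, delta: int) -> Image:
    """A simple content-preserving manipulation: uniform brightness shift."""
    return [[min(255, p + delta) for p in row] for row in img]


def is_same_content(a: Image, b: Image, threshold: int = 4) -> bool:
    """Authenticate b against a; the threshold is a hypothetical choice."""
    return hamming(average_hash(a), average_hash(b)) <= threshold


# Synthetic 8x8 images standing in for real photographs.
original = [[16 * x + 8 * y for x in range(8)] for y in range(8)]
similar = brighten(original, 20)                      # same content, brighter
distinct = [list(reversed(row)) for row in original]  # mirrored layout

print(hamming(average_hash(original), average_hash(similar)))
print(hamming(average_hash(original), average_hash(distinct)))
```

In this toy case the brightened copy hashes identically to the original, because shifting every pixel by the same amount (without clipping) shifts the block means and the threshold together, while the mirrored image lands far away in Hamming distance. The dataset described above is aimed at much harder cases: compound manipulations, and distinct images that are semantically close to the original.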

Keywords: perceptual image hashing; image authentication; data augmentation; dataset; content-preserving manipulations

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
