Multitask Transformer-based Network for Image Splicing Manipulation Detection

Cited by: 2


Authors: ZHANG Jingyuan (张婧媛); WANG Hongxia (王宏霞); HE Peisong (何沛松)

Affiliation: [1] School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China

Source: Computer Science (《计算机科学》), 2023, No. 1, pp. 114-122 (9 pages)

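Funding: Sichuan Science and Technology Program (2022YFG0320); National Natural Science Foundation of China (61902263, 61972269); Fundamental Research Funds for the Central Universities (YJ201881, 2020SCU12066); China Postdoctoral Science Foundation (2020M673276).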

Abstract: Most existing deep learning-based methods for image splicing forgery detection rely on convolution, a local operation with a limited receptive field. Moreover, most of them use only tampered-region localization to supervise training, which makes it difficult to learn richer tampering-trace features. To address these limitations, a multitask Transformer-based network (MT-Net) is proposed for image splicing detection and localization. The self-attention mechanism of the Transformer is used in the encoder during feature extraction to capture correlations between image pixels and to adaptively assign each pixel a different level of attention, which improves the network's ability to represent tampering traces. In addition, MT-Net trains on three subtasks simultaneously, namely tampered-region localization, tampered-edge localization, and tampered-area proportion prediction, so that learning is guided by both local refinement and global perception. A dedicated loss function is designed for each subtask to optimize the network during training. Experiments show that MT-Net achieves better detection accuracy than existing algorithms on three public datasets, CASIA V2.0, Columbia, and IDM2020, with F1 scores of 0.808, 0.913, and 0.675, respectively. Visualized detection results also show that the method localizes spliced regions well.
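The three subtasks named in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: binary cross-entropy for the region and edge maps, absolute error for the tampered-area proportion, and equal task weights. The helper names (`tampered_ratio`, `edge_from_mask`, `bce`, `multitask_loss`) are hypothetical and are not taken from the paper.

```python
import math

def tampered_ratio(mask):
    """Fraction of pixels marked as tampered (1) in a binary mask."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total

def edge_from_mask(mask):
    """Edge target (assumed definition): tampered pixels that have at
    least one untampered 4-neighbour inside the image."""
    h, w = len(mask), len(mask[0])
    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 0:
                    edge[y][x] = 1
                    break
    return edge

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total, n = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            n += 1
    return total / n

def multitask_loss(region_pred, edge_pred, ratio_pred, mask,
                   weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three assumed subtask losses."""
    edge = edge_from_mask(mask)
    l_region = bce(region_pred, mask)                 # local: region map
    l_edge = bce(edge_pred, edge)                     # local: boundary map
    l_ratio = abs(ratio_pred - tampered_ratio(mask))  # global: area share
    w_region, w_edge, w_ratio = weights
    return w_region * l_region + w_edge * l_edge + w_ratio * l_ratio

# A 5x5 ground-truth mask with a 3x3 spliced block in the centre.
mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)]
        for y in range(5)]
uniform = [[0.5] * 5 for _ in range(5)]  # an uninformative prediction
print(multitask_loss(uniform, uniform, 0.5, mask))
```

The edge and proportion targets are both derived from the same ground-truth mask, so under these assumptions the extra subtasks need no additional annotation. The region and edge terms supply pixel-level supervision, while the proportion term is a single image-level value.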

Keywords: digital image forensics; image splicing detection; Transformer; self-attention mechanism; multitask network

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
