Image Matting Method Based on Transformer


Authors: ZENG Bining; WANG Guodong (College of Computer Science & Technology, Qingdao University, Qingdao 266071, China)

Affiliation: [1] College of Computer Science and Technology, Qingdao University, Qingdao, Shandong 266071, China

Source: Journal of Qingdao University (Engineering & Technology Edition), 2023, No. 3, pp. 9-15 (7 pages)

Funding: Natural Science Foundation of Shandong Province (ZR2019MF050).

Abstract: To address the large model size and low output accuracy of existing image matting models, this paper studies image matting based on the Transformer. Simple non-parametric operations and the efficient Fourier transform are used as feature mixers to remove the size overhead of the Transformer. In addition, to avoid the slowdown caused by the reshape operations that dimension changes would otherwise require frequently, the encoder is designed as a locally dimension-consistent structure built by stacking efficient matting (EM) blocks and patch embedding blocks. To demonstrate the efficiency of the method, it is compared with state-of-the-art models on the Composition-1k dataset. The results show that, compared with Deep Image Matting [5], the model that introduced the Composition-1k dataset, the mean square error (MSE) of the proposed model is lower by 9.0×10^(-3) and the sum of absolute differences (SAD) is lower by 23.1. Compared with the baseline model MG Matting [6], the number of parameters and the number of floating-point operations (FLOPs) of the proposed model are reduced several-fold. These results indicate that the method achieves high performance on image matting and has broad application prospects.
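The two ideas in the abstract that lend themselves to a short illustration are the parameter-free feature mixers and the MSE/SAD metrics. The NumPy sketch below is only a rough illustration under stated assumptions, not the authors' implementation: the abstract does not say which non-parametric operation or which form of the Fourier transform is used, so `pool_mixer` (average pooling minus the input) and `fourier_mixer` (real part of a 2D FFT) are hypothetical stand-ins. Likewise, evaluating the metrics only inside a boolean `mask` and dividing SAD by 1000 follows a common matting convention that is assumed here rather than taken from the source.

```python
import numpy as np


def fourier_mixer(x):
    """Mix spatial information with a 2D FFT and keep the real part.

    x has shape (H, W, C). The mixer has no learnable parameters.
    Assumption: keeping the real part is an illustrative choice only.
    """
    return np.fft.fft2(x, axes=(0, 1)).real


def pool_mixer(x, k=3):
    """Mix spatial information with k x k average pooling minus the input.

    Stride 1 with edge padding, so the output keeps the shape (H, W, C).
    Assumption: this is one possible non-parametric operation.
    """
    p = k // 2
    padded = np.pad(x, ((p, p), (p, p), (0, 0)), mode="edge")
    h, w, _ = x.shape
    out = np.zeros_like(x, dtype=float)
    # Sum the k*k shifted views, then divide to get the local mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / (k * k) - x


def sad(pred, gt, mask):
    """Sum of absolute differences over the masked pixels, divided by 1000.

    Assumption: the /1000 scaling is a reporting convention, not from the source.
    """
    return float(np.abs(pred - gt)[mask].sum() / 1000.0)


def mse(pred, gt, mask):
    """Mean square error over the masked pixels."""
    return float(((pred - gt) ** 2)[mask].mean())
```

Both mixers preserve the input shape, so either one can be dropped into a residual block such as `x + pool_mixer(x)` without any reshape, and the metrics take alpha mattes with values in [0, 1] plus a boolean mask of the same shape.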

Keywords: image matting; Transformer; Fourier transform; local dimensional consistency

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
