Text-to-Image Synthesis Method Based on Spatial Attention and Conditional Augmentation

Authors: MA Jun (马军)[1,2], CHE Jin (车进), HE Yuting (贺愉婷)[1,2], MA Pengsen (马鹏森)

Affiliations: [1] School of Electronic and Electrical Engineering, Ningxia University, Yinchuan 750021, Ningxia, China; [2] Key Laboratory of Intelligent Sensing for Desert Information, Yinchuan 750021, Ningxia, China

Source: Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》), 2024, Issue 6, pp. 49-56 (8 pages)

Funding: National Natural Science Foundation of China (61861037); Ningxia University Graduate Innovation Research Fund (CXXM202223).

Abstract: To address semantic inconsistency between text and generated images, unstable training, and low diversity of generated images, a text-to-image model based on spatial attention and conditional augmentation is proposed on top of a simple and effective text-to-image baseline model. To stabilize training and increase the diversity of generated images, a conditional augmentation model is added to the baseline. To fit the image distribution starting from the text distribution, increase the diversity of visual features, and enlarge the representation space, an additional Affine block is added to the original DF-Block module. A spatial attention model is added to the discriminator to improve the semantic consistency between the text and the synthesized image. Experimental results show that the inception score increases by 2.05% and 2.63% on the CUB and Oxford-102 datasets, respectively, and that the Fréchet inception distance decreases by 20.73% and 9.25% on the CUB and COCO datasets, respectively. The images generated by the proposed model are more diverse and closer to real images.
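The abstract does not spell out how the conditional augmentation model works. The sketch below assumes it follows the widely used conditioning-augmentation idea: the text condition is sampled from a Gaussian whose mean and log-variance are predicted from the sentence embedding, and a KL term toward the standard normal is used as a regularizer. The function names, the plain-list representation, and the numeric values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
import random


def conditioning_augmentation(mu, logvar, rng):
    """Sample c = mu + sigma * eps with eps ~ N(0, I) (reparameterization)."""
    sigma = [math.exp(0.5 * lv) for lv in logvar]
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


# Hypothetical values: in a real model, mu and logvar would come from a
# learned layer applied to the sentence embedding.
rng = random.Random(0)
mu = [0.2, -0.1, 0.5]
logvar = [-1.0, -2.0, -0.5]

c1 = conditioning_augmentation(mu, logvar, rng)
c2 = conditioning_augmentation(mu, logvar, rng)  # same text, different sample
kl = kl_to_standard_normal(mu, logvar)
```

Because each call draws fresh noise, the same sentence yields slightly different condition vectors, which is one plausible reading of how such a module could increase the diversity of generated images.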

Keywords: text-to-image synthesis; DF-GAN; conditional augmentation model; Affine block; spatial attention model

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
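As a rough illustration of the Affine block named in the abstract and keywords, the sketch below assumes the text-conditioned channel-wise affine transformation used in DF-GAN: a scale and a shift are predicted per channel from the sentence vector and applied to the image feature map. For brevity, each predictor is a single linear layer rather than a multi-layer perceptron. The helper names, parameter layout, and numbers are hypothetical, and the exact placement of the extra Affine layer inside the DF-Block is not given in this record.

```python
def linear(weights, bias, x):
    """Dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def affine_block(feature_map, sentence_vec, scale_params, shift_params):
    """Channel-wise modulation: out[c][h][w] = gamma[c] * x[c][h][w] + beta[c]."""
    gamma = linear(*scale_params, sentence_vec)  # one scale per channel
    beta = linear(*shift_params, sentence_vec)   # one shift per channel
    return [
        [[g * v + b for v in row] for row in channel]
        for channel, g, b in zip(feature_map, gamma, beta)
    ]


# Hypothetical toy input: 2 channels of size 2x2 and a 2-dimensional sentence vector.
feature_map = [[[1, 2], [3, 4]], [[1, 1], [1, 1]]]
sentence_vec = [1.0, 2.0]
scale_params = ([[1, 0], [0, 1]], [0, 0])       # gamma = sentence_vec
shift_params = ([[0, 0], [0, 0]], [0.5, -1.0])  # beta = the bias alone

out = affine_block(feature_map, sentence_vec, scale_params, shift_params)
```

Stacking one more such block changes only how many times the features are modulated by the text; the per-channel form of each modulation stays the same.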
