Determinant Point Process Sampling Method for Text-to-Image Generation

Authors: LI Xiaolin, LI Gang [1,2], ZHANG Enqi, GU Guanghua

Affiliations: [1] Department of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, Hebei, China; [2] Hebei Key Laboratory of Information Transmission and Signal Processing, Qinhuangdao 066004, Hebei, China

Source: Geomatics and Information Science of Wuhan University, 2024, No. 2, pp. 246-255 (10 pages)

Funding: National Natural Science Foundation of China (62072394); Natural Science Foundation of Hebei Province (F2021203019).

Abstract: In recent years, text-to-image generation based on generative adversarial networks (GAN) has made major breakthroughs and can generate images that match the semantic information of a text. However, the generated images usually lack specific texture details and often suffer from mode collapse and a lack of diversity. To address these problems, a determinantal point process method for generative adversarial networks (GAN-DPP) is proposed to improve the quality of the generated samples, and GAN-DPP is implemented on two baseline models, StackGAN++ and ControlGAN. During training, the method uses a determinantal point process kernel matrix to model the diversity of real and synthetic data, and introduces an unsupervised penalty loss that encourages the generator to produce data whose diversity is similar to that of the real data. This improves the clarity and diversity of the generated samples and alleviates mode collapse without requiring an additional training procedure. On the CUB and Oxford-102 datasets, quantitative evaluation with three metrics, Inception Score, Fréchet Inception Distance, and Human Rank, demonstrates that GAN-DPP improves the diversity and quality of the generated images. Qualitative visual comparison also shows that images generated by models using GAN-DPP have richer texture details and significantly higher diversity.

Objectives: In recent years, text-to-image generation based on generative adversarial networks (GAN) has made great breakthroughs. It can generate images that correspond to the semantic information of the text and has great application value. However, the generated images usually lack specific texture details and often suffer from problems such as mode collapse and a lack of diversity.

Methods: This paper proposes a determinantal point process for generative adversarial networks (GAN-DPP) to improve the quality of the generated samples, and uses two baseline models, StackGAN++ and ControlGAN, to implement GAN-DPP. During training, it uses a determinantal point process kernel to model the diversity of real and synthetic data, and uses a penalty loss to encourage the generator to produce data as diverse as the real data. This improves the clarity and diversity of the generated samples and reduces problems such as mode collapse. No extra calculations are added during training.

Results: This paper compares the generated results through quantitative indicators. For the Inception Score, a higher value indicates better image clarity and diversity. On the Oxford-102 dataset, the score of GAN-DPP-S is 3.1% higher than that of StackGAN++, and the score of GAN-DPP-C is 3.4% higher than that of ControlGAN. On the CUB dataset, the score of GAN-DPP-S increases by 8.2% and the score of GAN-DPP-C increases by 1.9%. For the Fréchet Inception Distance, a lower value indicates better image quality. On the Oxford-102 dataset, the score of GAN-DPP-S is reduced by 11.1% and the score of GAN-DPP-C is reduced by 11.2%. On the CUB dataset, the score of GAN-DPP-S is reduced by 6.4% and the score of GAN-DPP-C is reduced by 3.1%.

Conclusions: The qualitative and quantitative comparative experiments show that the proposed GAN-DPP method improves the performance of the generative adversarial network model. The images generated by the model have richer texture details, and their diversity is significantly improved.
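The abstract does not give the form of the penalty loss, so the following is only a minimal sketch of the general idea of scoring batch diversity with a determinantal point process kernel. It assumes a Gram kernel L = ΦΦᵀ built from feature vectors, uses log det(L + I) as the diversity score, and penalises the squared gap between the real and generated batches. The function names, the choice of score, and the toy feature vectors are all hypothetical and are not taken from the paper; a real training loop would compute the same quantities with a tensor library on discriminator features.

```python
import math


def gram_kernel(features):
    """DPP kernel L = Phi Phi^T, where each row of Phi is one sample's feature vector."""
    return [[sum(a * b for a, b in zip(u, v)) for v in features] for u in features]


def log_det_plus_identity(kernel):
    """Return log det(L + I) via a Cholesky factorisation (L + I is positive definite)."""
    n = len(kernel)
    a = [[kernel[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    # det(A) is the squared product of the Cholesky diagonal.
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def diversity_penalty(real_features, fake_features):
    """Assumed penalty: squared gap between real and generated batch diversity scores."""
    real_div = log_det_plus_identity(gram_kernel(real_features))
    fake_div = log_det_plus_identity(gram_kernel(fake_features))
    return (real_div - fake_div) ** 2


# Toy unit-norm feature batches (hypothetical values, for illustration only).
real = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
collapsed = [[0.6, 0.8], [0.6, 0.8], [0.6, 0.8]]  # every sample identical
varied = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]     # spread similar to the real batch

print("penalty for collapsed batch:", diversity_penalty(real, collapsed))
print("penalty for varied batch:   ", diversity_penalty(real, varied))
```

In this toy setting, a generated batch whose samples are all identical is penalised more heavily than one whose spread resembles that of the real batch, which is the behaviour a diversity penalty of this kind is meant to encourage.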

Keywords: generative adversarial networks; text-to-image generation; determinantal point process; mode collapse; diversity

Classification number: P237 [Astronomy and Earth Sciences: Photogrammetry and Remote Sensing]
