Exploration and exploitation: a fine-grained hierarchical network for text-to-image synthesis


Authors: SHEN Hengtao [1]; ZHAO Qike [1]; ZHU Junchen [1]; GAO Lianli [1]; CHEN Daiyuan [2]; SONG Jingkuan [1]

Affiliations: [1] School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; [2] Zhejiang Lab, Hangzhou 311121, China

Source: China Sciencepaper (《中国科技论文》), 2023, No. 3, pp. 238-244 (7 pages)

Funding: Zhejiang Lab Open Research Project (2019KD0AD01/011).

Abstract: Text-to-image synthesis (T2I) aims to generate high-resolution images with photo-realistic details conditioned on text descriptions. Existing T2I methods adopt redundant stage-by-stage network structures and make ineffective use of text features, which hinders adequate convergence of the network. To address these problems, this study proposes a fine-grained hierarchical generative adversarial network (GAN) for T2I. The proposed network "explores" the semantic features of the text more fully with a multi-level text encoder, and "exploits" the generation ability of the backbone network better by stacking hierarchical modules, namely spatial affine generative blocks and cumulative blocks. Extensive experiments on three benchmark datasets show that the proposed method significantly surpasses state-of-the-art methods both quantitatively and qualitatively. The code is publicly available at https://github.com/qikizh/EE-GAN.

Keywords: cross-modal generation; text-to-image synthesis; generative adversarial networks; hierarchical network; multi-level text encoder

Classification code: TP301.6 [Automation and Computer Technology: Computer Architecture]

 
