Authors: ZHAO Zewei; CHE Jin; LÜ Wenhan (School of Physics and Electronic and Electrical Engineering, Ningxia University, Yinchuan 750021, China)
Affiliation: [1] School of Physics and Electronic and Electrical Engineering, Ningxia University, Yinchuan 750021, China
Source: Chinese Journal of Liquid Crystals and Displays, 2024, No. 2, pp. 168-179 (12 pages)
Funding: National Natural Science Foundation of China (No. 61861037).
Abstract: To address the problem that the text encoder in text-to-image generation cannot mine text information deeply, which leads to semantic inconsistency in the generated images, this paper proposes a text-to-image generation method based on an improved DMGAN model. First, a pre-trained XLNet model is used to encode the text; pre-trained on a large-scale corpus, it captures a large amount of prior knowledge about text and mines contextual information deeply. Then, a channel attention module is added to both the initial image-generation stage and the image-refinement stage of the DMGAN model to highlight important feature channels, further improving the semantic consistency and spatial-layout rationality of the generated images as well as the convergence speed and stability of the model. Experimental results show that, compared with the original DMGAN model, the images generated by the proposed model on the CUB dataset improve the IS score by 0.47 and reduce the FID score by 2.78, which fully demonstrates that the model has stronger cross-modal generation ability.
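To make the two components described in the abstract concrete, below is a minimal Python sketch, not the authors' implementation: it encodes a caption with a pre-trained XLNet from the Hugging Face transformers library and applies a squeeze-and-excitation style channel attention block of the kind that could be inserted into a generator stage. The checkpoint name "xlnet-base-cased", the mean pooling, the reduction ratio, and all variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn
from transformers import XLNetTokenizer, XLNetModel  # assumed dependency: transformers

# --- Text encoding with a pre-trained XLNet (checkpoint name is an assumption) ---
tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
encoder = XLNetModel.from_pretrained("xlnet-base-cased")

caption = "a small bird with a red head and white belly"
tokens = tokenizer(caption, return_tensors="pt")
with torch.no_grad():
    word_features = encoder(**tokens).last_hidden_state  # (1, seq_len, 768)
sentence_feature = word_features.mean(dim=1)              # simple pooling placeholder


# --- Channel attention block (squeeze-and-excitation style sketch) ---
class ChannelAttention(nn.Module):
    """Re-weights feature channels using globally pooled channel statistics."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze: global spatial average
        self.fc = nn.Sequential(                       # excitation: per-channel gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gates                               # emphasize important channels


# Example: apply the block to a feature map from a generator stage (shapes are illustrative)
feature_map = torch.randn(1, 64, 32, 32)
attended = ChannelAttention(64)(feature_map)
print(sentence_feature.shape, attended.shape)
```

In the paper's setting, such a block would sit after the feature maps of the initial and refinement stages of DMGAN; the exact placement, pooling scheme, and reduction ratio shown here are assumptions rather than the published configuration.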
Keywords: text-to-image generation; XLNet model; generative adversarial network; channel attention
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]