Image synthesis method based on multiple text descriptions (基于多文本描述的图像生成方法)  Cited by: 1

Authors: NIE Kaiqin (聂开琴); NI Zhengwei (倪郑威) (College of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou 310018, China)

Affiliation: [1] College of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou, Zhejiang 310018, China

Source: Telecommunications Science (电信科学), 2024, No. 5, pp. 73-85 (13 pages)

Funding: Natural Science Foundation of Zhejiang Province (No. LQ22F010008).

Abstract: Images generated from a single text description are often of low quality and contain structural errors. To address this problem, a multi-stage generative adversarial network model was adopted, and interpolation across different text sequences was proposed, so that features extracted from multiple text descriptions enrich the given description and give the generated images more detail. To generate images more relevant to the text, a multi-caption deep attentional multimodal similarity model was introduced to obtain attention features, which were combined with the visual features of the preceding layer as input to the next layer, improving both the realism of the generated images and their semantic consistency with the text descriptions. To help the model learn to coordinate the details at each position, a self-attention mechanism was introduced so that the generator produces images closer to real scenes. The optimized model was validated on the CUB and MS-COCO datasets; the generated images had complete structures, stronger semantic consistency, and richer, more varied visual effects.

Keywords: text-to-image generation; generative adversarial network; computer vision; semantic consistency; self-attention

Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
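
Illustrative note: the abstract mentions interpolating across text sequences to enrich a given description with features from several captions, but this record does not say how the interpolation is performed. The sketch below shows one possible reading only, a convex combination of per-caption sentence embeddings. The function name `interpolate_captions`, the weighting scheme, and the toy vectors are all assumptions made for illustration, not the authors' method.

```python
def interpolate_captions(embeddings, weights=None):
    """Blend several caption embeddings into one vector.

    Assumption: "interpolation" is read here as a convex combination
    (non-negative weights normalized to sum to 1). The paper may use a
    different scheme.
    """
    if not embeddings:
        raise ValueError("need at least one caption embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    if weights is None:
        # Hypothetical default: treat every caption as equally informative.
        weights = [1.0] * len(embeddings)
    if len(weights) != len(embeddings) or any(w < 0 for w in weights):
        raise ValueError("need one non-negative weight per embedding")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must not all be zero")
    norm = [w / total for w in weights]
    # Weighted sum, computed one coordinate at a time.
    return [sum(w * e[i] for w, e in zip(norm, embeddings)) for i in range(dim)]


# Toy 4-dimensional "sentence embeddings" for three captions of one image.
captions = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 2.0, 0.0, 4.0],
    [2.0, 4.0, 1.0, 4.0],
]
blended = interpolate_captions(captions)                    # equal weights
first_only = interpolate_captions(captions, [1.0, 0.0, 0.0])  # degenerate case
```

With equal weights the blended vector is the coordinate-wise mean of the captions, and putting all the weight on one caption recovers the single-description case that the abstract takes as its starting point.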

 
