Authors: 高欣宇 (GAO Xinyu); 杜方 (DU Fang); 宋丽娟 (SONG Lijuan)
Affiliations: [1] School of Information Engineering, Ningxia University, Yinchuan 750021, China; [2] Ningxia Key Laboratory of Artificial Intelligence and Information Security for Channeling Computing Resources from the East to the West, Yinchuan 750021, China; [3] Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Yinchuan 750021, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2024, No. 24, pp. 44-64 (21 pages)
Funding: National Natural Science Foundation of China (62062058); Ningxia Key Research and Development Project (2023BEG02009).
Abstract: With the continuous development of deep learning, artificial intelligence generated content has become a hot topic. Diffusion models in particular, as an emerging class of generative model, have made significant progress in text-to-image generation. This article comprehensively describes the application of diffusion models to text-to-image generation tasks and compares them with generative adversarial networks and autoregressive models, revealing the advantages and limitations of diffusion models. It also examines specific methods by which diffusion models improve image quality, optimize model efficiency, and generate images from multilingual text prompts. Experimental analyses on the CUB, COCO, and T2I-CompBench datasets not only validate the zero-shot generation capability of diffusion models but also highlight their ability to generate high-quality images from complex text prompts. The article then introduces the promising applications of diffusion models in text-guided image editing, 3D generation, video generation, and medical image generation. Finally, it summarizes the challenges that diffusion models face in text-to-image generation and their future development trends, with the aim of facilitating further research in this domain.
Keywords: text-to-image generation; diffusion models; generative adversarial networks; autoregressive models
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]