Image to 3D vase generation technology combining procedural content generation and diffusion models


Authors: SUN Heyi (孙禾衣); LI Yixiao (李艺潇); TIAN Xi (田希); ZHANG Songhai (张松海) [2]

Affiliations: [1] Zhili College, Tsinghua University, Beijing 100084, China; [2] Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; [3] Department of Computer Science, University of Bath, Bath, Somerset 133789, UK

Source: Journal of Graphics (《图学学报》), 2025, No. 2, pp. 332-344 (13 pages)

Abstract: In traditional manual production of 3D content, 3D meshes and textures are the foundation of 3D assets. To improve visual quality and rendering performance, meshes are typically built from quadrilateral faces and need good topology and a sensible UV mapping; textures must match the geometry and remain globally consistent. However, current 3D content generation techniques based on latent diffusion models do not yet meet these standards, which limits their potential in practical applications. Meanwhile, procedural content generation is widely used in the game and architecture industries because it can create, from rules, large numbers of 3D assets that conform to industry best practices. To improve the usability of generated assets, an integrated solution combining procedural content generation with diffusion models is proposed. Taking the vase, a specific kind of 3D solid of revolution, as an example, the image-to-3D asset generation problem is divided into two main tasks: 3D mesh reconstruction and 3D texture generation. For 3D mesh reconstruction, a novel vase generation program is created, and a deep neural network is trained to learn the mapping between image features and the program's parameters, enabling reconstruction of a 3D model from a 2D image. For 3D texture generation, a novel two-stage texturing strategy is proposed that combines the strengths of multi-view image generation and multi-view consistency sampling to produce high-definition texture maps with global consistency. In summary, a scheme for automatically constructing 3D vase assets from images is presented; it can be extended to other 3D solids of revolution and is expected to apply to other categories of 3D content.
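The abstract does not describe the authors' vase generation program, so the following is only a minimal illustrative sketch of the general idea it names: a procedural generator for a solid of revolution that emits an all-quad mesh with a regular UV layout. The function names (`vase_profile`, `revolve_profile`) and the parameters (`base_radius`, `belly_radius`, `neck_radius`) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def vase_profile(height=1.0, base_radius=0.25, belly_radius=0.4,
                 neck_radius=0.15, rings=16):
    """Hypothetical silhouette: list of (radius, y) samples from base to neck.

    The radius blends linearly from base to neck, plus a sine bulge that
    reaches belly_radius at mid-height. These parameters are assumptions,
    not the parameterization used in the paper.
    """
    bulge = belly_radius - 0.5 * (base_radius + neck_radius)
    profile = []
    for i in range(rings):
        t = i / (rings - 1)
        radius = (1 - t) * base_radius + t * neck_radius
        radius += bulge * math.sin(math.pi * t)
        profile.append((radius, t * height))
    return profile


def revolve_profile(profile, segments=32):
    """Revolve a profile around the y axis into a quad mesh with UVs.

    Returns (vertices, uvs, quads). Each ring has segments + 1 vertices:
    the last column duplicates the first so that u runs from 0 to 1
    without wrapping across the seam.
    """
    rings = len(profile)
    cols = segments + 1
    vertices, uvs = [], []
    for i, (radius, y) in enumerate(profile):
        v = i / (rings - 1)
        for j in range(cols):
            u = j / segments
            angle = 2.0 * math.pi * u
            vertices.append((radius * math.cos(angle), y,
                             radius * math.sin(angle)))
            uvs.append((u, v))
    quads = []
    for i in range(rings - 1):
        for j in range(segments):
            a = i * cols + j          # current ring, current column
            b = a + 1                 # current ring, next column
            quads.append((a, b, b + cols, a + cols))
    return vertices, uvs, quads


if __name__ == "__main__":
    verts, uvs, quads = revolve_profile(vase_profile(), segments=32)
    print(len(verts), "vertices,", len(quads), "quads")
```

In a pipeline like the one the abstract outlines, a network would predict the generator's parameters from an image, and the mesh would then follow deterministically from the program; because every face is a quad on a regular grid, the topology and UV layout stay predictable regardless of the predicted shape.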

Keywords: diffusion models; procedural content generation; 3D reconstruction; texture generation; deep learning

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
