Authors: JIN Cong (靳聪)[1], ZHOU Manling (周满玲), LIN Meixiu (林美秀), ZHANG Jiayi (张佳一), WANG Jing (王晶)[3], LIU Miao (刘淼)
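Affiliations: [1] School of Information and Communication Engineering, Communication University of China, Beijing 100024, China; [2] School of Advertising, Communication University of China, Beijing 100024, China; [3] School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China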
Source: Journal of Communication University of China: Science and Technology (《中国传媒大学学报(自然科学版)》), 2024, No. 4, pp. 55-63 (9 pages)
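Funding: National Natural Science Foundation of China (62207029, 62271454); Beijing Natural Science Foundation-Xiaomi Joint Fund (L223033); Fundamental Research Funds for the Central Universities (CUC230B018).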
Abstract: With the rapid advancement of VR and AR technologies, demand for immersive experiences is growing, and virtual face technology is maturing at the same time. This paper therefore explores integrating highly realistic virtual faces into VR/AR to make the user experience more natural and immersive. In the field of virtual digital humans, however, image generation and face-swapping techniques still face many challenges in VR/AR environments; in particular, the performance of lip-synthesis models in dynamic scenes and multilingual environments needs further optimisation. To address these problems, the paper proposes VR/AR-AdaptFace, an adaptive multimodal face replacement scheme for virtual reality and augmented reality. The model consists of two modules. The "text-to-image" module (“文颜绘真”) uses advanced text-to-image conversion techniques and a class-specific prior preservation strategy to optimise virtual face generation, and substantially improves image quality through an attention mechanism. The "speech-to-lip reflection" module (“语唇映生”) relies on a powerful generator, a lip synchronisation discriminator and a visual quality discriminator to achieve accurate synchronisation between speech and lip shape, bringing a more realistic experience to dynamic interaction in VR/AR scenes.
Keywords: face synthesis; detail enhancement model; dynamic video lip synthesis; virtual reality; augmented reality
Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]