Audio2AB: Audio-driven collaborative generation of virtual character animation


Authors: Lichao NIU, Wenjun XIE, Dong WANG, Zhongrui CAO, Xiaoping LIU

Affiliations: [1] School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China [2] School of Software, Hefei University of Technology, Hefei 230009, China [3] Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei University of Technology, Hefei 230601, China

Source: Virtual Reality & Intelligent Hardware (《虚拟现实与智能硬件(中英文)》), 2024, No. 1, pp. 56-70 (15 pages)

Funding: Supported by the National Natural Science Foundation of China (62277014); the National Key Research and Development Program of China (2020YFC1523100); the Fundamental Research Funds for the Central Universities of China (PA2023GDSK0047).

Abstract: Background: Considerable research has been conducted on audio-driven virtual character gestures and facial animation, with some degree of success. However, few methods exist for generating full-body animations, and the portability of virtual character gestures and facial animations has not received sufficient attention. Methods: We therefore propose a deep-learning-based audio-to-animation-and-blendshape (Audio2AB) network that generates gesture animations and ARKit's 52 facial expression blendshape weights from audio, audio-corresponding text, emotion labels, and semantic relevance labels, producing parametric data for full-body animation. This parameterization can be used to drive full-body animations of virtual characters and improves their portability. In the experiment, we first downsampled the gesture and facial data to achieve the same temporal resolution for the input, output, and facial data. The Audio2AB network then encoded the audio, audio-corresponding text, emotion labels, and semantic relevance labels, and fused the text, emotion labels, and semantic relevance labels into the audio to obtain better audio features. Finally, we established links between the body, gesture, and facial decoders and generated the corresponding animation sequences using our proposed GAN-GF loss function. Results: Using audio, audio-corresponding text, emotion labels, and semantic relevance labels as input, the trained Audio2AB network could generate gesture animation data containing blendshape weights. Different 3D virtual character animations could therefore be created through parameterization. Conclusions: The experimental results showed that the proposed method could generate significant gestures and facial animations.
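
The portability claim in the abstract rests on the output being named blendshape weights rather than character-specific geometry. The following is a minimal sketch of that idea, not the authors' implementation: it applies the standard linear blendshape formula (posed vertex = neutral vertex + sum of weight times per-shape offset) to two hypothetical characters. The function `apply_blendshapes`, both toy meshes, and the weight values are assumptions made for illustration; `jawOpen` is one of the ARKit blendshape names.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def apply_blendshapes(
    neutral: List[Vec3],
    deltas: Dict[str, List[Vec3]],
    weights: Dict[str, float],
) -> List[Vec3]:
    """Return neutral + sum_i(w_i * delta_i) per vertex (hypothetical helper)."""
    posed = [list(v) for v in neutral]
    for name, w in weights.items():
        if name not in deltas:
            continue  # this character has no such shape; ignore the weight
        w = min(1.0, max(0.0, w))  # blendshape weights are kept in [0, 1]
        for vi, offset in enumerate(deltas[name]):
            for axis in range(3):
                posed[vi][axis] += w * offset[axis]
    return [(v[0], v[1], v[2]) for v in posed]


# Two toy characters (assumed data) with different geometry but the same
# named blendshape, so one weight frame can drive both.
char_a_neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
char_a_deltas = {"jawOpen": [(0.0, -0.5, 0.0), (0.0, -0.5, 0.0)]}

char_b_neutral = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
char_b_deltas = {"jawOpen": [(0.0, -1.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 0.0)]}

# One frame of weights, standing in for a single frame of network output.
frame_weights = {"jawOpen": 0.5}

posed_a = apply_blendshapes(char_a_neutral, char_a_deltas, frame_weights)
posed_b = apply_blendshapes(char_b_neutral, char_b_deltas, frame_weights)
```

Because the weight frame refers to shapes only by name, the same frame moves both meshes according to their own offsets, which is the sense in which a weight-based output can be retargeted across characters.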

Keywords: Audio-driven; Virtual character; Full-body animation; Audio2AB; Blendshape; GAN-GF

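Classification codes: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]; TP391.41 [Electronics and Telecommunications: Information and Communication Engineering]; J218.7 [Automation and Computer Technology: Computer Application Technology]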

 
