supported by the General Program of the National Natural Science Foundation of China (Grant No. 61977029).
Generating realistic synthetic video from text is highly challenging because of the many issues involved, including digit deformation, noise interference between frames, blurred output, and the need for tem...