Denoising diffusion models have demonstrated tremendous success in modeling data distributions and synthesizing high-quality samples. In the 2D image domain, they have become the state of the art and are capable of gene...
supported by the European Research Council (ERC, Advanced Grant Number 742870); the Swiss National Science Foundation (SNF, Grant Numbers 200021 and 192356); the National Natural Science Foundation of China (Grant Number 62476143).
Inspired by Minsky’s Society of Mind, Schmidhuber’s Learning to Think, and other more recent works, this paper proposes and advocates for the concept of natural language-based societies of mind (NLSOMs). We imagine ...
The use of pretrained backbones with finetuning has shown success for 2D vision and natural language processing tasks, with advantages over task-specific networks. In this paper, we introduce a pretrained 3D backbone, call...
supported by RCUK grant CAMERA (EP/M023281/1, EP/T022523/1); the Centre for Augmented Reasoning (CAR) at the Australian Institute for Machine Learning, and a gift from Adobe.
Storyboards comprising key illustrations and images help filmmakers to outline ideas, key moments, and story events when filming movies. Inspired by this, we introduce the first contextual benchmark dataset Script-to-Stor...
supported by the “Pioneer” and “Leading Goose” R&D Program of Zhejiang (No. 2023C01181); the National Natural Science Foundation of China (No. 62302134); the Zhejiang Provincial Natural Science Foundation (No. LQ24F020031); and the Information Technology Center and State Key Lab of CAD&CG, Zhejiang University.
In this study, we propose a novel method to reconstruct the 3D shapes of transparent objects using images captured by handheld cameras under natural lighting conditions. It combines the advantages of an explicit mesh an...
supported by the Zhuhai Industry-University-Research Project (No. 2220004002411); National Key R&D Program of China (No. 2021YFE0205700); Science and Technology Development Fund of Macao (Nos. 0070/2020/AMJ, 00123/2022/A3, and 0096/2023/RIA2); Zhuhai City Polytechnic Research Project (No. 2024KYBS02); Shenzhen Science and Technology Innovation Committee (No. SGDX20220530111001006); the University of Macao under Grants MYRG (Nos. GRG2023-00061-FST UMDF and 2022-00084-FST).
Point cloud completion aims to infer complete point clouds based on partial 3D point cloud inputs. Various previous methods apply coarse-to-fine strategy networks for generating complete point clouds. However, such method...
Real-world blind image super-resolution is a challenging problem due to the absence of target high-resolution images for training. Inspired by the recent success of the single image generation based method SinGAN, we ta...
Language-guided fashion image editing is challenging, as fashion image editing is local and requires high precision, while natural language cannot provide precise visual information for guidance. In this paper, we propose...
supported by the National Natural Science Foundation of China under Grant Nos. 62171038, 61827901, and 62088101.
Visible and infrared image fusion (VIF) aims to combine information from visible and infrared images into a single fused image. Previous VIF methods usually employ a color space transformation to keep the hue and saturat...
supported by the National Natural Science Foundation of China (No. 61972227); the Natural Science Foundation of Shandong Province (No. ZR201808160102); Shandong Provincial Natural Science Foundation Key Project (No. ZR2020KF015); the Key Research and Development Project of Shandong Province (No. 2019GSF109112); the Science and Technology Plan for Young Talents in Colleges and Universities of Shandong Province (No. 2020KJN007); the Scientific Research Studio in Colleges and Universities of Ji’nan City (No. 2021GXRC092); the Science and Technology Research Program for Colleges and Universities in Shandong Province (No. KJ2018BZN029).
Rain streaks in an image appear in different sizes and orientations, resulting in severe blurring and visual quality degradation. Previous CNN-based algorithms have achieved encouraging deraining results although there a...