The past decade has witnessed the impressive and steady development of single-modal AI technologies in several fields, thanks to the emergence of deep learning. Less studied, however, is multi-modal AI, commonly considered...
This work was supported by the National Natural Science Foundation of China (Nos. 62006225, 61906199 and 62071468), the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS), China (No. XDA 27040700), and the Beijing Nova Program, China (Nos. Z201100006820050 and Z211100002121010).
In the daily application of an iris-recognition-at-a-distance (IAAD) system, many ocular images of low quality are acquired. As the iris part of these images often fails to meet the recognition requirements, the mor...
This work was supported by the National Natural Science Foundation of China (Nos. 61872256 and 62102205), the Key-Area Research and Development Program of Guangdong Province, China (No. 2021B0101400002), the Peng Cheng Laboratory Key Research Project, China (No. PCL 2021A07), and Multi-source Cross-platform Video Analysis and Understanding for Intelligent Perception in Smart City, China (No. U20B2052).
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations from transformers (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc. Insp...
This work was supported in part by the Australian Research Council (ARC) (Nos. FL-170100117, DP-180103424, IC-190100031 and LE-200100049).
Concept learning constructs visual representations that are connected to linguistic semantics, which is fundamental to vision-language tasks. Although promising progress has been made, existing concept learners are st...
This work was supported in part by the National Natural Science Foundation of China (Nos. 62002395, 61976250 and U1811463), the National Key R&D Program of China (No. 2021ZD0111601), and the Guangdong Basic and Applied Basic Research Foundation, China (Nos. 2021A15150123 and 2020B1515020048).
Visual representation learning is ubiquitous in various real-world applications, including visual comprehension, video understanding, multi-modal analysis, human-computer interaction, and urban computing. Due to the emergen...
This work was supported by the National Natural Science Foundation of China (Nos. 61976209 and 62020106015), the CAS International Collaboration Key Project, China (No. 173211KYSB20190024), and the Strategic Priority Research Program of CAS, China (No. XDB32040000).
Nowadays, deep neural networks (DNNs) have been equipped with powerful representation capabilities. The deep convolutional neural networks (CNNs) that draw inspiration from the visual processing mechanism of the primate ear...