This work was supported by the National Natural Science Foundation of China (No. 62176006); the National Key Research and Development Program of China (No. 2022YFF0902302).
In recent years, computing art has developed rapidly with the in-depth cross study of artificial intelligence generated content (AIGC) and the main features of artworks. Audio-visual content generation has gradually been...
This work was supported by the Double First-Class Innovation Research Project for People’s Public Security University of China (No. 2023SYL06).
Score-based multimodal biometric fusion has been shown to be successful in addressing the problem of unimodal techniques’ vulnerability to attack and poor performance in low-quality data. However, difficulties still exi...
This work was supported by the Natural Science Foundation of China (Nos. 62006224 and 62122088).
Machine translation is an important and challenging task that aims at automatically translating natural language sentences from one language into another. Recently, Transformer-based neural machine translation (NMT) has a...
This work was supported by the National Natural Science Foundation of China (No. 62036006); the Fundamental Research Funds for the Central Universities, China; the Innovation Fund of Xidian University, China.
With the growing awareness of data privacy, federated learning (FL) has gained increasing attention in recent years as a major paradigm for training models with privacy protection in mind, which allows building models in ...
Cross-modal image-text retrieval is a fundamental task in bridging vision and language. It faces two main challenges that are typically not well addressed in previous works. 1) Generalizability: Existing methods often...
Multimodal sentence summarization (MMSS) is a new yet challenging task that aims to generate a concise summary of a long sentence and its corresponding image. Although existing methods have gained promising success in MM...
This work was supported by the National Natural Science Foundation of China (No. 62072462); the National Key R&D Program of China (No. 2020AAA0108600); the Large-scale Pretraining Program 468 of Beijing Academy of Artificial Intelligence (BAAI).
Multimodal pretraining has made convincing achievements in various downstream tasks in recent years. However, since the majority of the existing works construct models based on English, their applications are limited by ...
This work was supported by the Key Research Program of the Chinese Academy of Sciences (No. ZDBSSSW-JSC006); the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA 27030300).
In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) to a new era. Substantial works have shown that they are beneficial...
Many isolation approaches, such as zoning search, have been proposed to preserve the diversity in the decision space of multimodal multi-objective optimization (MMO). However, these approaches allocate the same computi...