Research Institute for Advanced Manufacturing (RIAM) of The Hong Kong Polytechnic University (1-CDJT); Intra-Faculty Interdisciplinary Project 2023/24 (1-WZ4N); Research Committee of The Hong Kong Polytechnic University; State Key Laboratory of Intelligent Manufacturing Equipment and Technology, Huazhong University of Science and Technology (IMETKF2024010); Guangdong-Hong Kong Technology Cooperation Funding Scheme (GHX/075/22GD); Innovation and Technology Commission (ITC); COMAC International Collaborative Research Project (COMAC-SFGS-2023-3148); General Research Fund from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. PolyU15210222 and PolyU15206723); Open access funding provided by The Hong Kong Polytechnic University.
Human-robot collaboration (HRC) is set to transform the manufacturing paradigm by leveraging the strengths of human flexibility and robot precision. The recent breakthrough of Large Language Models (LLMs) and Vision-Langua...
Supported by the National Natural Science Foundation of China (No. 62001313); the Liaoning Professional Talent Project (No. XLYC2203046); the Shenyang Municipal Medical Engineering Cross Research Foundation of China (No. 22-321-32-09).
Medical visual question answering (MedVQA) aims to enhance diagnostic confidence and deepen patients' understanding of their health conditions. While the Transformer architecture is widely used in multimodal fields, its app...
1 Introduction In recent years, foundation Vision-Language Models (VLMs), such as CLIP [1], which empower zero-shot transfer to a wide variety of domains without fine-tuning, have led to a significant shift in machine learn...
In multimodal learning, Vision-Language Models (VLMs) have become a critical research focus, enabling the integration of textual and visual data. These models have shown significant promise across various natural lang...
National Key Research and Development Program of China, Grant/Award Number: 2022YFB4700400; National Natural Science Foundation of China, Grant/Award Numbers: U22B2041, 62173056; Education Department of Hainan Province, Grant/Award Number: Hnky2024ZD-19.
In the past decades, substantial progress has been made in human action recognition. However, most existing studies and datasets for human action recognition utilise still images or videos as the primary modality. Image-b...
Supported by the National Key Research and Development Program of China (Grant No. 2022ZD0160403); the National Natural Science Foundation of China (Grant No. 62176178).
Driven by the expansion of foundation models and the increasing variety of downstream tasks, parameter-efficient fine-tuning (PEFT) methods have exhibited remarkable efficacy in the unimodal domain, effectively mitigatin...
Supported in part by the United States Department of Agriculture (USDA) National Institute of Food and Agriculture (NIFA), Award Number 2023-67021-40614.
The transformation of age-old farming practices through the integration of digitization and automation has sparked a revolution in agriculture that is driven by cutting-edge computer vision and artificial intelligence...
Supported by the Science and Technology Major Project of Fujian Province of China (Grant No. 2022HZ028018); the National Natural Science Foundation of China (Grant No. 51907030).
The state of health (SOH) and remaining useful life (RUL) of lithium-ion batteries are crucial for health management and diagnosis. However, most data-driven estimation methods heavily rely on scarce labeled data, while trad...