Vision-language model-based human-robot collaboration for smart manufacturing: A state-of-the-art survey

Authors: Junming FAN, Yue YIN, Tian WANG, Wenhang DONG, Pai ZHENG, Lihui WANG

Affiliations: [1] Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hong Kong 999077, China; [2] Department of Production Engineering, KTH Royal Institute of Technology, Stockholm, Sweden

Source: Frontiers of Engineering Management (工程管理前沿, English edition), 2025, Issue 1, pp. 177-200 (24 pages)

Funding: Research Institute for Advanced Manufacturing (RIAM) of The Hong Kong Polytechnic University (1-CDJT); Intra-Faculty Interdisciplinary Project 2023/24 (1-WZ4N); Research Committee of The Hong Kong Polytechnic University; State Key Laboratory of Intelligent Manufacturing Equipment and Technology, Huazhong University of Science and Technology (IMETKF2024010); Guangdong-Hong Kong Technology Cooperation Funding Scheme (GHX/075/22GD); Innovation and Technology Commission (ITC); COMAC International Collaborative Research Project (COMAC-SFGS-2023-3148); General Research Fund from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. PolyU15210222 and PolyU15206723); Open access funding provided by the Hong Kong Polytechnic University.

Abstract: Human-robot collaboration (HRC) is set to transform the manufacturing paradigm by leveraging the strengths of human flexibility and robot precision. The recent breakthroughs in Large Language Models (LLMs) and Vision-Language Models (VLMs) have motivated preliminary explorations and adoptions of these models in the smart manufacturing field. However, despite considerable effort, existing research has mainly focused on individual components, without a comprehensive perspective that addresses the full potential of VLMs, especially for HRC in smart manufacturing scenarios. To fill this gap, this work offers a systematic review of the latest advancements and applications of VLMs in HRC for smart manufacturing. It covers the fundamental architectures and pretraining methodologies of LLMs and VLMs, their applications in robotic task planning, navigation, and manipulation, and their role in enhancing human-robot skill transfer through multimodal data integration. Lastly, the paper discusses current limitations and future research directions in VLM-based HRC, highlighting the trend toward fully realizing the potential of these technologies for smart manufacturing.

Keywords: vision-language models; large language models; human-robot collaboration; smart manufacturing

Classification code: TG1 [Metallurgy and Metalworking: Metallurgy]
