Authors: CHEN Yongcai (陈泳财); ZHANG Qiang (张强); HUANG Yongqiu (黄咏秋); ZHEN Xiantong (甄先通); ZHANG Lei (张磊)
Affiliation: [1] School of Computer Science, Guangdong University of Petrochemical Technology, Maoming 525000, Guangdong, China
Source: Journal of Guangdong University of Petrochemical Technology (《广东石油化工学院学报》), 2024, No. 4, pp. 93-99 (7 pages)
Funding: National Natural Science Foundation of China (62476064); Natural Science Foundation of Guangdong Province (2022A1515011527, 2024A1515010455).
Abstract: Vision-language models have achieved strong performance in many domains in recent years, owing to their exceptional generalization capabilities. However, their generalization is challenged when they are applied to specialized domain data, such as ferrography data from lubricating oil. Adapting a vision-language model to a specific domain with limited data, and in particular achieving adaptive learning on ferrography images, is a new challenge. This study proposes a novel adaptive meta-learning approach that combines vision-language models with large language models, termed CLAML (CLIP-LLM Adaptation on Meta Learning). CLAML builds on the CLIP model and uses large language models (LLMs) to regenerate text descriptions. For each category of ferrography data, the LLM generates text descriptions covering aspects such as cause, morphology, size, and color. These multi-perspective ferrography cues are then used to fine-tune the CLIP model, making it better suited to specialized data such as ferrography. The approach establishes a semantic bridge between images and text in the specialized domain, enhancing zero-shot recognition capability. In addition, when only a few samples are available, an adaptive meta-learning method is introduced so that the model adapts rapidly to ferrography images, further improving its performance. Experimental results demonstrate the effectiveness of the CLAML method in identifying wear types in ferrography images.
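Keywords: vision-language model; large language model; ferrography image classification; zero-shot learning; adaptive meta-learning
Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]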
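
Illustrative sketch: the zero-shot step described in the abstract, in which an image is matched against several LLM-generated descriptions per class, can be sketched as follows. This is not the authors' CLAML implementation. It assumes the image and text embeddings have already been produced by some CLIP-style encoder, and it replaces them here with small hand-written toy vectors. The class names `class_a` and `class_b`, the function names, and the `temperature` value are all hypothetical. Averaging the description embeddings of a class into one prototype is a common prompt-ensembling choice, used here as an assumption rather than something the record confirms.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products equal cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def class_prototypes(text_embeddings):
    """Average each class's description embeddings into one unit-length prototype.

    text_embeddings: dict mapping class name -> array of shape (n_descriptions, dim).
    Each row stands in for the embedding of one description (cause, morphology, ...).
    """
    names = list(text_embeddings)
    protos = np.stack(
        [l2_normalize(l2_normalize(text_embeddings[n]).mean(axis=0)) for n in names]
    )
    return names, protos


def zero_shot_predict(image_embedding, names, protos, temperature=0.01):
    """Return the best-matching class and softmax probabilities over classes."""
    sims = protos @ l2_normalize(image_embedding)  # cosine similarity per class
    logits = sims / temperature                    # hypothetical temperature value
    logits -= logits.max()                         # stabilize the softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    return names[int(np.argmax(probs))], probs


# Toy stand-ins for encoder outputs (assumed, not real CLIP embeddings).
toy_text_embeddings = {
    "class_a": np.array([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]),
    "class_b": np.array([[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]),
}
toy_image_embedding = np.array([0.8, 0.2, 0.1])

names, protos = class_prototypes(toy_text_embeddings)
label, probs = zero_shot_predict(toy_image_embedding, names, protos)
print(label, probs)
```

With a real encoder, the toy arrays would be replaced by the embeddings of the ferrography image and of the generated descriptions. The fine-tuning and meta-learning stages mentioned in the abstract are not covered by this sketch.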