Authors: WANG Zhongpan [1]; YUAN Ye [1]; LI Qingdu [1]; WAN Lihong [2,3]; LIU Na [1] (affiliation indices are not given per author in the source record and are omitted below if unverifiable)
Affiliations: [1] School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; [2] School of Electronics, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, Shanghai 200030, China; [3] Origin Dynamics Intelligent Robot Co., Ltd., Zhengzhou, Henan 450018, China
Source: Software Guide (《软件导刊》), 2023, No. 9, pp. 52-58 (7 pages)
Funding: National Natural Science Foundation of China (62006165); Shanghai Pujiang Talent Program (2019PJD035); Shanghai Special Fund for Artificial Intelligence Innovation and Development (2019RGZN01041).
Abstract: Real-world garbage datasets usually follow a severely class-imbalanced, long-tailed distribution, which limits the generalization of conventional deep learning models on garbage classification and recognition tasks. To address this, a new data relabeling algorithm and framework is proposed to improve the generalization and accuracy with which cleaning robots recognize and classify garbage. The algorithm comprises feature extraction, feature clustering, and label mapping modules. When training a commonly used classification model, it analyzes the data distribution of the dataset, feeds the feature vectors from the feature extraction module into the feature clustering module to generate several subclasses for each class, and assigns each subclass a corresponding pseudo-label, which alleviates data imbalance at the label level. At prediction time, the label mapping module converts pseudo-labels back into true labels. Experiments show that the proposed algorithm significantly improves tail-class performance on long-tailed garbage datasets without sacrificing head-class performance, and that relabeling significantly improves the classification accuracy of the different class-imbalanced learning methods used as baselines on long-tailed garbage datasets.
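The relabeling idea in the abstract (cluster each class's feature vectors into subclasses, train on pseudo-labels, map predictions back to true labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no details of the clustering method or of how the number of subclasses is chosen, so the tiny k-means routine, the rule "subclasses proportional to class size relative to the smallest class", and the names `_kmeans`, `relabel`, and `pseudo_to_true` are all assumptions made for illustration. The feature vectors are assumed to come from some feature extractor and are given here as plain lists.

```python
from collections import defaultdict


def _kmeans(points, k, iters=20):
    """Tiny deterministic k-means (assumed clustering step); returns a cluster index per point."""
    n = len(points)
    centers = [points[i * n // k] for i in range(k)]  # evenly spaced seeds
    assign = [0] * n
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the mean of its members; keep the old center if empty.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def relabel(features, labels):
    """Split each class into subclasses and return (pseudo_labels, pseudo_to_true)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Assumption: aim for subclasses roughly the size of the smallest class.
    target = min(len(idxs) for idxs in by_class.values())
    pseudo = [None] * len(labels)
    pseudo_to_true = {}
    next_id = 0
    for y in sorted(by_class):
        idxs = by_class[y]
        k = max(1, round(len(idxs) / target))
        assign = _kmeans([features[i] for i in idxs], k)
        for i, a in zip(idxs, assign):
            pseudo[i] = next_id + a
        for a in range(k):
            pseudo_to_true[next_id + a] = y  # label mapping used at prediction time
        next_id += k
    return pseudo, pseudo_to_true


# Toy data: head class 0 (6 samples, two visible groups) and tail class 1 (3 samples).
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10],
            [50, 50], [50, 51], [51, 50]]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
pseudo, pseudo_to_true = relabel(features, labels)
recovered = [pseudo_to_true[p] for p in pseudo]  # maps pseudo-labels back to true labels
```

In this toy case the head class is split into two pseudo-classes of three samples each, so all three pseudo-classes have the same size as the tail class. A classifier would then be trained on `pseudo`, and its predictions converted with `pseudo_to_true`.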
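Keywords: garbage classification; deep learning; class-imbalanced learning; data relabeling; dataset analysis; feature clustering; image processing; computer vision
Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]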