Authors: CAO Xiaomin (曹小敏); LIU Jinfeng (刘进锋) (College of Information Engineering, Ningxia University, Yinchuan 750021, China)
Source: Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》), 2024, No. 5, pp. 31-38 (8 pages)
Funding: Natural Science Foundation of Ningxia (2023AAC03126).
Abstract: Long-tailed data distributions tend to reduce the recognition accuracy of network models. To address this problem, a two-stage long-tail classification model based on causal inference is proposed. The model first uses re-weighting to remove possible spurious associations between features and labels, and then uses balanced fine-tuning to further improve recognition accuracy on tail classes with few samples. In the first stage, a decorrelated sample re-weighting algorithm with an iterative optimization effect is designed to remove spurious correlations and achieve stable prediction. In the second stage, a CAM-based class-balanced sampling algorithm is designed for balanced fine-tuning, so that features learned from the imbalanced dataset are transferred and rebalanced across all classes, improving classification performance on tail classes. Experimental results show that the model achieves superior performance and, compared with other models, offers better interpretability at both the theoretical level and the data level.
Keywords: long-tail distribution; causal inference; decorrelation; class-balanced sampling; interpretability
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
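
Note on class-balanced sampling: the abstract names a CAM-based class-balanced sampling algorithm but does not give its details. The sketch below only illustrates the generic idea of class-balanced sampling (choose a class uniformly, then choose a sample within that class), so that tail classes are drawn as often as head classes. It is not the authors' CAM-based algorithm; the function name `class_balanced_sample`, its parameters, and the toy label list are assumptions made for illustration.

```python
import random
from collections import defaultdict


def class_balanced_sample(labels, num_samples, seed=0):
    """Draw sample indices so that every class is equally likely to be picked.

    Hypothetical helper for illustration only; it does not use CAM.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    classes = sorted(by_class)
    picked = []
    for _ in range(num_samples):
        cls = rng.choice(classes)                 # uniform over classes
        picked.append(rng.choice(by_class[cls]))  # uniform within the class
    return picked


# Toy long-tailed label list: 95 head-class samples, 5 tail-class samples.
labels = [0] * 95 + [1] * 5
indices = class_balanced_sample(labels, num_samples=1000)
tail_share = sum(labels[i] == 1 for i in indices) / len(indices)
```

In the toy data the tail class makes up 5% of the samples, yet `tail_share` comes out near one half, because each draw picks the class first. Tail samples are therefore repeated many times, which is the usual trade-off of this kind of sampling.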