Class Imbalance Multi-label Classification with Common and Label Specific Features  Cited by: 1


Authors: ZHANG Hai-xiang [1], LI Pei-pei [2], HU Xue-gang [2]

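Affiliations: [1] Information Division, The Second People's Hospital of Hefei Affiliated to Bengbu Medical College, Hefei 230012, Anhui, China; [2] Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230601, Anhui, China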

Source: Computer Technology and Development (《计算机技术与发展》), 2024, No. 2, pp. 46-52 (7 pages)

Funding: National Natural Science Foundation of China (61976077, 62076085, 62120106008); Science and Technology Program of Bengbu Medical College (2022byzd225sk).

Abstract: Multi-label classification addresses problems in which each instance is associated with multiple class labels. Most existing multi-label methods use a single data representation built from all features to discriminate all labels. Because each label has its own characteristics, such a unified feature set cannot fully separate the labels, which degrades model training and increases its time cost. Exploiting the most discriminative features for each label to improve classification performance is therefore a challenge. In addition, the class imbalance found in real-world data also degrades the performance of multi-label learning models. Motivated by this, we propose a class imbalance multi-label classification method with common and label-specific features. First, we find the nearest neighbors of seed instances and use interpolation to obtain the features of synthetic instances, which addresses class imbalance. Second, to find the most representative features for each label, we impose l1-norm and l2,1-norm regularizers on the coefficient matrix to extract label-specific features and common features. Finally, we use label correlation so that the model outputs of associated labels are similar, and instance correlation so that associated features share the corresponding label distribution information, which improves classification performance. Extensive experiments show that the proposed method is competitive with other multi-label learning approaches and achieves better classification accuracy.
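The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact oversampling rule or objective, so the code assumes a SMOTE-style interpolation between a seed instance and one of its nearest neighbors, and a coefficient matrix whose rows are features and whose columns are labels. Under that assumed layout, an l1 penalty favors sparsity in individual entries (a feature used by only some labels), while an l2,1 penalty favors zeroing whole rows (a feature kept or dropped for all labels together). All function names and data values below are hypothetical.

```python
import math
import random


def nearest_neighbors(seed, candidates, k):
    """Return the k candidates closest to seed by Euclidean distance."""
    return sorted(candidates, key=lambda x: math.dist(seed, x))[:k]


def synthesize(seed, neighbor, rng):
    """Interpolate a synthetic feature vector on the segment seed -> neighbor."""
    t = rng.random()  # interpolation weight in [0, 1)
    return [s + t * (n - s) for s, n in zip(seed, neighbor)]


def l1_norm(W):
    """Sum of absolute entries; penalizing it encourages entry-wise sparsity."""
    return sum(abs(w) for row in W for w in row)


def l21_norm(W):
    """Sum of the Euclidean norms of the rows; penalizing it encourages row sparsity."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


# Hypothetical minority-label instances with two features each.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
seed = minority[0]
neighbors = nearest_neighbors(seed, minority[1:], k=2)
rng = random.Random(0)
synthetic = synthesize(seed, rng.choice(neighbors), rng)

# Hypothetical coefficient matrix: rows are features, columns are labels (assumed layout).
W = [[3.0, 4.0], [0.0, 0.0], [0.0, -2.0]]
print(l1_norm(W))   # 9.0
print(l21_norm(W))  # 7.0
```

In this toy matrix the second feature has an all-zero row, so it contributes nothing to either norm, whereas the third feature is used by only one label and still contributes to both.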

Keywords: multi-label classification; class imbalance; common features; label-specific features; label correlation

Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
