Authors: 张海翔 (ZHANG Hai-xiang) [1]; 李培培 (LI Pei-pei) [2]; 胡学钢 (HU Xue-gang) [2]
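Affiliations: [1] Information Division, The Second People's Hospital of Hefei Affiliated to Bengbu Medical College, Hefei 230012, Anhui, China; [2] Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230601, Anhui, China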
Source: Computer Technology and Development (《计算机技术与发展》), 2024, No. 2, pp. 46-52 (7 pages)
Funding: National Natural Science Foundation of China (61976077, 62076085, 62120106008); Science and Technology Program of Bengbu Medical College (2022byzd225sk).
Abstract: Multi-label classification addresses the problem in which a single instance is associated with multiple class labels. Most existing multi-label methods use one shared data representation, built from all features, to discriminate every label. Because each label has its own characteristics, a unified feature set cannot fully distinguish the labels, which harms model training and increases its time cost. How to improve classification performance by using the most discriminative features for each label therefore remains a challenge. In addition, class imbalance in real-world data also degrades the performance of multi-label learning models. Motivated by this, we propose a class-imbalanced multi-label classification method with common and label-specific features. First, we find the nearest neighbors of seed instances and use interpolation to obtain the features of synthetic instances, which addresses class imbalance. Second, to find the most representative features for each label, we impose $\ell_1$-norm and $\ell_{2,1}$-norm regularizers on the coefficient matrix to extract label-specific features and common features. Finally, we use label correlation so that the model outputs of associated labels are similar, and instance correlation so that associated features share the corresponding label distribution information, which improves classification performance. Extensive experiments show that the proposed method achieves better classification accuracy than other multi-label classification methods.
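The two building blocks named in the abstract, neighbor-based interpolation and the two matrix norms, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the fixed interpolation factor `gap`, and the convention that rows of the coefficient matrix `W` index features and columns index labels are all assumptions made for illustration.

```python
import math

def nearest_neighbors(seed, candidates, k):
    """Return the k candidates closest to seed (Euclidean distance assumed)."""
    return sorted(candidates, key=lambda x: math.dist(seed, x))[:k]

def synthesize(seed, neighbor, gap):
    """Interpolate a synthetic feature vector between seed and neighbor.

    gap is a factor in [0, 1]; 0 reproduces the seed, 1 the neighbor.
    """
    return [s + gap * (n - s) for s, n in zip(seed, neighbor)]

def l1_norm(W):
    """Sum of absolute entries of W; penalizing it favors entry-wise sparsity."""
    return sum(abs(w) for row in W for w in row)

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W; penalizing it favors
    zeroing whole rows (assumed here to correspond to whole features)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)

# Hypothetical minority-class instances with two features each.
minority = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
seed = minority[0]
neighbor = nearest_neighbors(seed, minority[1:], k=1)[0]
synthetic = synthesize(seed, neighbor, gap=0.5)

# Hypothetical 2x2 coefficient matrix (rows: features, columns: labels).
W = [[3.0, -4.0], [0.0, 0.0]]
print(synthetic, l1_norm(W), l21_norm(W))
```

Under these assumptions, an entry-wise penalty lets each label keep its own small set of nonzero coefficients, while a row-wise penalty keeps or drops a feature for all labels at once. How the paper combines the two terms in its objective is not stated in the abstract.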
Keywords: multi-label classification; class imbalance; common features; label-specific features; label correlation
Classification code: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]