Authors: WU You (武优), WANG Jing (王静), LI Peipei (李培培) [1,2], HU Xuegang (胡学钢) [1,2,3]
Affiliations: [1] School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China; [2] Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230601, China; [3] Anhui Province Key Laboratory of Industry Safety and Emergency Technology (Hefei University of Technology), Hefei 230601, China
Source: Computer Science (《计算机科学》), 2025, No. 4, pp. 161-168 (8 pages)
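Funding: National Natural Science Foundation of China (62376085, 62076085, 62120106008); Special Fund of the Institute of Health Big Data and Population Medicine, 大健康研究院 (JKS2023003).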
Abstract: Multi-label feature selection is an effective dimensionality-reduction technique that selects a discriminative subset of features from the original feature space. Traditional multi-label feature selection methods, however, suffer when annotation accuracy degrades. In real data, each instance is annotated with a candidate label set in which noisy labels are mixed with the relevant ones; such data is called partial multi-label data. Existing multi-label feature selection algorithms usually assume that training samples are accurately labeled, or consider only the case of missing labels. Moreover, in practice only a small portion of a large-scale, high-dimensional multi-label dataset is labeled. This paper therefore proposes a novel semi-supervised partial multi-label feature selection method. First, to address the partial multi-label problem, the true relationships between labels are learned from the samples with known labels, and manifold regularization is used to keep the structure of the feature space consistent with that of the label space. Second, to address missing labels, a label propagation algorithm is used to enrich the label information. In addition, to address high feature dimensionality, a low-rank constraint is imposed on the mapping matrix to reveal latent connections between labels, and an l_{2,1}-norm constraint is introduced to select features with strong discriminative ability. Experimental results show that the proposed method has a significant performance advantage over existing semi-supervised multi-label feature selection methods.
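As a rough illustration of two building blocks named in the abstract, the sketch below shows a generic label propagation step and l_{2,1}-norm-based feature ranking in NumPy. It is not the paper's algorithm: the function names, the row-normalized similarity matrix `S`, the clamping of labeled rows, and the toy matrices are all assumptions made for illustration, and the low-rank constraint and manifold regularizer are omitted.

```python
import numpy as np

def l21_norm(W):
    """l_{2,1} norm: sum of the Euclidean norms of the rows of W."""
    return float(np.linalg.norm(W, axis=1).sum())

def rank_features(W):
    """Rank features by the row norms of a (features x labels) mapping matrix.

    Rows driven towards zero by an l_{2,1} penalty correspond to features
    that contribute little, so larger row norm means a more useful feature.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms, kind="stable")

def propagate_labels(S, Y, labeled, n_iter=100):
    """Generic label propagation (illustrative, not the paper's variant).

    S       : (n x n) non-negative similarity matrix, no all-zero rows
    Y       : (n x q) label matrix; rows of unlabeled samples are zero
    labeled : boolean mask of length n marking labeled samples
    """
    P = S / S.sum(axis=1, keepdims=True)   # row-normalized transitions
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = P @ F                          # spread scores to neighbours
        F[labeled] = Y[labeled]            # keep known labels fixed
    return F

# Toy example: a 3-node chain where only the two end nodes are labeled.
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
Y = np.array([[1, 0], [0, 0], [0, 1]])
labeled = np.array([True, False, True])
F = propagate_labels(S, Y, labeled)

# Toy mapping matrix: feature 1 has an all-zero row and ranks last.
W = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]])
order = rank_features(W)
```

In this toy setting the middle node ends up with equal scores for both labels, and the features are ordered by how strongly their rows of `W` survive the sparsity penalty.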
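Keywords: multi-label feature selection; partial multi-label learning; semi-supervised learning; feature dimensionality reduction; noisy labels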
Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]