Label Noise Filtering via Perception of Nearest Neighbors  Cited by: 9

Authors: JIANG Gaoxia [1], FAN Ruixuan, WANG Wenjian [1,2]

Affiliations: [1] School of Computer and Information Technology, Shanxi University, Taiyuan 030006; [2] Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2020, No. 6, pp. 518-529 (12 pages)

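Funding: Supported by the National Natural Science Foundation of China (No. 61673249, U1805263, 61906113), the Key Research and Development Program for International Cooperation of Shanxi Province (No. 201903D421050), and the Scientific and Technological Innovation Program of Higher Education Institutions in Shanxi Province (No. 2020L0007).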

Abstract: Label noise filtering algorithms based on k nearest neighbors are sensitive to the choice of the neighbor parameter k. To address this problem, a label noise filtering algorithm based on perception of nearest neighbors (PNN) is proposed, which effectively handles intra-class label noise in binary classification datasets. PNN considers positive and negative samples separately, so that label noise detection in classification is transformed into two outlier detection problems on single-class data. First, a neighbor perception strategy automatically determines a personalized neighbor parameter for each sample, which avoids the sensitivity to k. Second, all samples are divided into core samples and non-core samples according to a noise factor, and the non-core samples are taken as the label noise candidates. Finally, noise is identified and filtered by combining the label information of the nearest neighbors of each candidate sample. Experiments indicate that the proposed algorithm performs well in both noise filtering and classification prediction.
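The three stages described in the abstract can be sketched roughly as follows. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and is not the authors' PNN definition: the personalized k is taken as the position of the largest gap in a sample's sorted within-class distances, the noise factor is a sample's mean within-class neighbor distance divided by the class median, and the names `personalized_k`, `filter_label_noise`, `k_min`, `k_max`, and `factor_threshold` are hypothetical.

```python
import math
import random
from statistics import median


def personalized_k(sorted_dists, k_min=3, k_max=10):
    """Assumed stand-in for neighbor perception: cut at the largest distance gap."""
    k_max = min(k_max, len(sorted_dists))
    if k_max <= k_min:
        return k_max
    # gaps[j] is the jump that follows the (k_min + j)-th nearest neighbor
    gaps = [sorted_dists[k] - sorted_dists[k - 1] for k in range(k_min, k_max)]
    return k_min + gaps.index(max(gaps))


def filter_label_noise(X, y, k_min=3, k_max=10, factor_threshold=1.5):
    """Return one boolean per sample; True marks a suspected label-noise sample."""
    n = len(y)
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    k_of, reach, factor = [0] * n, [0.0] * n, [0.0] * n

    # Stages 1 and 2: treat each class separately as a single-class outlier problem.
    for label in set(y):
        members = [i for i in range(n) if y[i] == label]
        if len(members) <= k_min:
            raise ValueError("each class needs more than k_min samples")
        for i in members:
            d = sorted(dist[i][j] for j in members if j != i)
            k_of[i] = personalized_k(d, k_min, k_max)
            reach[i] = sum(d[:k_of[i]]) / k_of[i]
        typical = max(median(reach[i] for i in members), 1e-12)
        for i in members:
            factor[i] = reach[i] / typical  # assumed noise factor

    # Stage 3: non-core samples are candidates; confirm them by neighbor labels.
    is_noise = [False] * n
    for i in range(n):
        if factor[i] <= factor_threshold:
            continue  # core sample, kept as is
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        nearest = others[:k_of[i]]
        disagree = sum(y[j] != y[i] for j in nearest) / len(nearest)
        is_noise[i] = disagree > 0.5
    return is_noise


# Toy data: two well-separated clusters with one flipped label in each.
rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
X += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(30)]
y = [1] * 30 + [0] * 30
y[0], y[30] = 0, 1
flags = filter_label_noise(X, y)
print([i for i, flagged in enumerate(flags) if flagged])
```

On this toy data the two flipped samples (indices 0 and 30) are expected to be flagged, because each lies far from every other sample that shares its label while all of its nearest neighbors carry the opposite label. A peripheral but correctly labeled sample may still become a candidate, yet it is kept because its neighbors agree with its label.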

Keywords: Label noise filtering; Perception of nearest neighbors; Personalized k nearest neighbors; Outlier detection; Noise factor

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
