Important sampling based active learning for imbalance classification  


Authors: Xinyue WANG, Bo LIU, Siyu CAO, Liping JING, Jian YU

Affiliations: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; [2] Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China; [3] College of Information Science and Technology, Hebei Agricultural University, Baoding 071001, China

Source: Science China (Information Sciences), 2020, Issue 8, pp. 192-205 (14 pages). Chinese journal name: 中国科学(信息科学)(英文版), i.e. the English edition.

Funding: National Natural Science Foundation of China (Grant Nos. 61822601, 61773050, 61632004, 61972132); Beijing Natural Science Foundation (Grant No. Z180006); National Key Research and Development Program (Grant No. 2017YFC1703506); Fundamental Research Funds for the Central Universities (Grant Nos. 2019JBZ110, 2019YJS040); Youth Foundation of Hebei Education Department (Grant No. QN2018084); Science and Technology Foundation of Hebei Agricultural University (Grant No. LG201804); Research Project for Self-cultivating Talents of Hebei Agricultural University (Grant No. PY201810).

Abstract: Imbalance in data distribution hinders the learning performance of classifiers. A popular family of methods addresses this problem by sampling (oversampling the minority class and undersampling the majority class) so that the imbalanced data become relatively balanced. However, these methods usually rely on a single sampling technique, either oversampling or undersampling, which makes them suffer when the imbalance ratio (the number of majority instances divided by the number of minority instances) is large. In this paper, an active learning framework is proposed to deal with imbalanced data by alternately performing important sampling (ALIS), which consists of selecting important majority-class instances and generating informative minority-class instances. In ALIS, the two important sampling strategies affect each other: the selected majority-class instances provide much clearer information for the next oversampling step, while the generated minority-class instances provide more sufficient information for the next undersampling step. Extensive experiments have been conducted on real-world datasets with a wide range of imbalance ratios to verify ALIS. The experimental results demonstrate the superiority of ALIS over state-of-the-art methods in terms of several well-known evaluation metrics.
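The alternation described in the abstract can be illustrated with a rough sketch. This is not the authors' ALIS algorithm: the abstract does not specify how instances are judged important or how new instances are generated, so the criteria below are assumptions made only for illustration (majority points are kept by distance to the nearest minority point, and minority points are generated by SMOTE-style interpolation). The function names, the `rounds` and `step` parameters, and the convention that label 1 marks the minority class are all hypothetical. Only `imbalance_ratio` follows a definition given in the abstract.

```python
import numpy as np

def imbalance_ratio(y, minority_label=1):
    """Majority-class size over minority-class size (definition from the abstract)."""
    y = np.asarray(y)
    return int(np.sum(y != minority_label)) / int(np.sum(y == minority_label))

def select_majority(X_maj, X_min, keep):
    """Assumed undersampling criterion: keep the `keep` majority points
    that lie closest to any minority point."""
    d = np.linalg.norm(X_maj[:, None, :] - X_min[None, :, :], axis=2).min(axis=1)
    return X_maj[np.argsort(d)[:keep]]

def generate_minority(X_min, X_maj_sel, n_new, rng):
    """Assumed oversampling criterion: start from the minority points nearest
    the selected majority points and interpolate toward random minority points."""
    d = np.linalg.norm(X_min[:, None, :] - X_maj_sel[None, :, :], axis=2).min(axis=1)
    order = np.argsort(d)
    base = X_min[order[np.arange(n_new) % len(X_min)]]
    mate = X_min[rng.integers(0, len(X_min), size=n_new)]
    t = rng.random((n_new, 1))
    return base + t * (mate - base)

def alternate_sampling(X, y, rounds=3, step=10, seed=0):
    """Alternate the two steps so that each one uses the other's latest output."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X_min, X_maj = X[y == 1], X[y != 1]
    X_sel = X_maj
    for _ in range(rounds):
        # Undersampling step: shrink the majority set, never below the minority size.
        keep = max(len(X_min), len(X_sel) - step)
        X_sel = select_majority(X_maj, X_min, keep)
        # Oversampling step: grow the minority set, never above the majority size.
        n_new = min(step, len(X_sel) - len(X_min))
        if n_new > 0:
            X_min = np.vstack([X_min, generate_minority(X_min, X_sel, n_new, rng)])
    X_out = np.vstack([X_sel, X_min])
    # Majority instances are relabeled 0 and minority instances 1 in the output.
    y_out = np.concatenate([np.zeros(len(X_sel), dtype=int),
                            np.ones(len(X_min), dtype=int)])
    return X_out, y_out
```

With 100 majority and 10 minority instances, three rounds at `step=10` leave 70 majority and 40 minority instances under these assumptions, so the imbalance ratio falls from 10 to 1.75.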

Keywords: imbalance classification; important sampling; active learning; oversampling; undersampling

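Classification codes: TP311.13 [Automation and Computer Technology: Computer Software and Theory]; TP181 [Automation and Computer Technology: Computer Science and Technology]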

 
