Balanced Graph-Based Semi-Supervised Learning Method


Authors: ZHANG Yan[1], ZHANG Chenguang[1], ZHANG Xiahuan

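Affiliations: [1] College of Information Science and Technology, Hainan University, Haikou 570228; [2] Image Processing Department, Beijing Lingyun Guangshi Company (北京凌云光视公司), Beijing 100097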

Source: Journal of Systems Science and Mathematical Sciences (《系统科学与数学》), 2016, No. 8, pp. 1107-1118 (12 pages)

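Funding: Natural Science Foundation of Hainan Province (20166211); Scientific Research Project of Higher Education Institutions of Hainan Province (Hjkj2012-01); National Natural Science Foundation of China (11261015)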

Abstract: Many real-world applications of machine learning suffer from class imbalance, where one class has far fewer samples than the others. Imbalance biases the decision boundary of a classifier toward the majority class at the expense of the minority class, so that test samples are wrongly assigned to the majority class. To address this problem, this paper proposes a balanced graph-based semi-supervised learning method (BGSSL). The method introduces a class equilibrium factor term into the energy function, so that the confidence values are as smooth as possible on the graph and also as balanced as possible across classes, which effectively reduces the adverse effect of data imbalance. Statistical analysis of comparative experiments on 21 benchmark datasets shows that, on imbalanced data, BGSSL achieves significantly better classification results (significance level 0.05) than support vector machines (SVM) and other graph-based semi-supervised learning methods.
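The abstract does not state the energy function itself, so the following is only a minimal sketch of the general idea: a graph smoothness term plus a penalty that pulls the total confidence of each class toward an equal share. The penalty form, the weight `mu`, the gradient-descent solver, the function names, and the toy chain graph are all assumptions made for illustration; none of them is taken from the paper.

```python
# Hypothetical sketch: graph smoothness energy plus an ASSUMED class-balance
# penalty. This is not the energy function of BGSSL, which the abstract
# does not give.

def energy(W, F, mu):
    """Smoothness over edges plus mu times the squared deviation of each
    class's total confidence from an equal share (assumed balance target)."""
    n, k = len(F), len(F[0])
    smooth = 0.5 * sum(
        W[i][j] * (F[i][c] - F[j][c]) ** 2
        for i in range(n) for j in range(n) for c in range(k)
    )
    target = n / k
    balance = sum((sum(row[c] for row in F) - target) ** 2 for c in range(k))
    return smooth + mu * balance


def propagate(W, labels, mu, k=2, step=0.1, iters=500):
    """Minimize energy() by gradient descent over the unlabeled rows only.
    W must be symmetric; labels maps node index -> class index.
    The step size is tuned for the toy graph below, not for general use."""
    n = len(W)
    F = [[1.0 / k] * k for _ in range(n)]
    for i, c in labels.items():
        F[i] = [1.0 if j == c else 0.0 for j in range(k)]
    target = n / k
    for _ in range(iters):
        mass = [sum(row[c] for row in F) for c in range(k)]
        new_F = [row[:] for row in F]
        for i in range(n):
            if i in labels:
                continue  # labeled nodes stay clamped to their one-hot label
            for c in range(k):
                grad = 2.0 * sum(W[i][j] * (F[i][c] - F[j][c]) for j in range(n))
                grad += 2.0 * mu * (mass[c] - target)
                new_F[i][c] = F[i][c] - step * grad
        F = new_F
    return F


def predict(F):
    """Assign each node to the class with the largest confidence."""
    return [max(range(len(row)), key=lambda c: row[c]) for row in F]


# Toy chain 0-1-2-3-4-5 with three majority labels and one minority label.
n = 6
W = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
labels = {0: 0, 1: 0, 2: 0, 5: 1}

for mu in (0.0, 1.0):
    F = propagate(W, labels, mu)
    print(mu, predict(F)[3:5])  # predicted classes of unlabeled nodes 3 and 4
```

On this toy chain, `mu = 0` reduces to the plain harmonic solution and assigns node 3 to the majority class, while `mu = 1` shifts node 3 to the minority class. This only illustrates how a balance term can counteract the pull of the majority labels; it says nothing about the accuracy reported in the paper.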

Keywords: imbalanced datasets; graph-based semi-supervised learning; support vector machine

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
