Active Learning Algorithm of SVM Combining Tri-training Semi-supervised Learning and Convex-Hull Vector  (Cited by: 6)

Authors: Xu Hailong (徐海龙)[1], Long Guangzheng (龙光正)[1], Bie Xiaofeng (别晓峰)[1], Wu Tian'ai (吴天爱)[1], Guo Pengsong (郭蓬松)[1]

Affiliation: [1] Air and Missile Defense College, Air Force Engineering University, Xi'an 710051

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2016, No. 1, pp. 39-46 (8 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 61273275)

Abstract: Large numbers of labeled samples are difficult to obtain in supervised learning, and labeling samples is costly. To address these problems, an active learning algorithm for support vector machines (SVM) based on Tri-training semi-supervised learning and convex-hull vectors is proposed, combining active learning with semi-supervised learning. First, the hull vectors of the sample set are computed, and the hull vectors most likely to become support vectors are selected for labeling. Second, existing active learning algorithms make no further use of the unlabeled samples once the most informative samples have been selected and labeled; to solve this problem, the Tri-training semi-supervised learning method is introduced into the SVM active learning process. Unlabeled samples whose class labels have high confidence are added to the training set, so that information in the unlabeled samples that benefits the learner is exploited. Experiments on UCI datasets show that, when few labeled samples are available, the proposed algorithm obtains an SVM classifier with higher classification accuracy and better generalization performance, and reduces the sample labeling cost of SVM training.
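The two ingredients named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how hull vectors are computed or how "most likely to become support vectors" is scored, so the code below only assumes 2-D inputs and extracts hull vertices with Andrew's monotone chain. The Tri-training step is likewise simplified to its agreement rule (a sample is proposed for one classifier when the other two agree on its label) and omits the error-rate conditions of the full Tri-training procedure. The function names `convex_hull` and `tri_training_pseudo_labels`, and the use of plain callables as classifiers, are hypothetical choices made for illustration.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices of 2-D points in counter-clockwise order.

    Assumption: samples are 2-D tuples. Uses Andrew's monotone chain;
    collinear boundary points are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Pop while the last two points and p do not make a left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def tri_training_pseudo_labels(classifiers, unlabeled):
    """Simplified Tri-training agreement rule.

    classifiers: three callables mapping a sample to a label (hypothetical
    stand-ins for three trained SVMs).
    Returns three lists; list i holds (sample, label) pairs on which the
    other two classifiers agree, proposed as extra training data for i.
    """
    if len(classifiers) != 3:
        raise ValueError("Tri-training uses exactly three classifiers")
    proposals = [[], [], []]
    for x in unlabeled:
        preds = [clf(x) for clf in classifiers]
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            if preds[j] == preds[k]:
                proposals[i].append((x, preds[j]))
    return proposals


# Toy usage: hull vertices are the candidates an oracle would be asked to label.
pool = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
candidates = convex_hull(pool)

# Three toy decision rules standing in for trained classifiers.
clfs = [
    lambda x: 1 if x[0] > 1 else 0,
    lambda x: 1 if x[1] > 1 else 0,
    lambda x: 1 if x[0] + x[1] > 2 else 0,
]
proposals = tri_training_pseudo_labels(clfs, [(3, 3), (3, 0)])
```

In this toy run the interior point `(1, 1)` is not a labeling candidate, since only points on the hull boundary are kept. A real pipeline would replace the three lambdas with SVMs trained on resampled labeled sets and retrain each one on its proposal list.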

Keywords: Active learning; Semi-supervised learning; Support vector machine (SVM); Convex-hull vector; Tri-training algorithm

CLC number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
