A Printing Registration Identification Method Based on Semi-Supervised Learning (基于半监督学习的印刷套准识别方法)

Cited by: 1

Authors: CHEN Wei (陈伟) [1]; JIAN Chuan-xia (简川霞) [2]

Affiliations: [1] School of Art, Ningbo City College of Vocational Technology, Ningbo 315100, China; [2] College of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006, China

Source: Digital Printing (《数字印刷》), 2022, No. 2, pp. 52-60 (9 pages)

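Funding: Scientific Research Project of the Zhejiang Provincial Department of Education (No. Y202147591); Guangdong Provincial Key Laboratory of Cyber-Physical Systems project (No. 2016B030301008); Key Youth Fund Project of Guangdong University of Technology (No. 17QNZD001); College Students' Innovation and Entrepreneurship Training Program (No. yj202111845031).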

Abstract: A small number of labeled training samples degrades the performance of printing registration identification models. To address this problem, this study proposes a semi-supervised method that combines oversampling preprocessing of safe samples with co-training. First, the k-nearest neighbor method is used to identify the safe samples in the training set. Oversampling is then performed among the safe samples to generate synthetic samples, which are combined with the original training set to form a new training set. Next, Bootstrap sampling is used to draw three training subsets from the new training set, and a decision tree base classifier is trained on each subset. The base classifiers repeatedly predict unlabeled samples, which are added to the training subsets to update the classifiers, until the classifiers become stable. The three base classifiers are then integrated into the final classification model. Experimental results show that the classification performance of the proposed method improves gradually as the number of training samples increases. With 800 training samples, the method achieves a classification accuracy (Accuracy) of 98% and a geometric mean of recalls (G-mean) of 99% on the test set, both higher than those of the other methods in the experiment trained on the same number of samples. The proposed method can effectively exploit unlabeled samples to improve the performance of the printing registration identification model, and it enables printing registration identification with a small labeled training set.
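The safe-sample oversampling step and the G-mean metric named in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact rule for a "safe" sample or the interpolation scheme, so the code assumes two things: a sample is safe when more than half of its k nearest neighbours share its label, and synthetic samples are drawn by SMOTE-style linear interpolation between two safe samples of the same class. The function names (`safe_indices`, `oversample_safe`, `g_mean`) and the toy data are hypothetical and are not taken from the paper.

```python
import math
import random


def k_nearest(idx, X, k):
    """Return indices of the k nearest neighbours of X[idx] (Euclidean distance)."""
    dists = sorted((math.dist(X[idx], x), j) for j, x in enumerate(X) if j != idx)
    return [j for _, j in dists[:k]]


def safe_indices(X, y, k=3):
    """Assumed rule: a sample is 'safe' if more than half of its
    k nearest neighbours carry the same label as the sample itself."""
    safe = []
    for i in range(len(X)):
        same = sum(1 for j in k_nearest(i, X, k) if y[j] == y[i])
        if same > k / 2:
            safe.append(i)
    return safe


def oversample_safe(X, y, label, n_new, k=3, seed=0):
    """Generate n_new synthetic samples of class `label` by interpolating
    between randomly chosen pairs of safe samples (assumed SMOTE-style)."""
    rng = random.Random(seed)
    pool = [X[i] for i in safe_indices(X, y, k) if y[i] == label]
    if len(pool) < 2:
        return []  # not enough safe samples to interpolate between
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(pool, 2)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1 / len(recalls))


# Toy data: two tight clusters plus one class-1 point inside the class-0 cluster.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
     (0.05, 0.05)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

safe = safe_indices(X, y, k=3)            # the stray point (index 8) is excluded
new_points = oversample_safe(X, y, label=1, n_new=5)
score = g_mean([0, 0, 1, 1], [0, 1, 1, 1])  # recalls are 0.5 and 1.0
```

Restricting interpolation to safe samples keeps synthetic points away from regions where classes overlap, which is the usual motivation for this kind of preprocessing. The co-training stage described in the abstract (three Bootstrap subsets, three decision trees, iterative labeling of unlabeled samples) is not covered by this sketch.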

Keywords: co-training; semi-supervised learning; printing registration; decision tree

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]
