Semi-supervised fake job advertisement detection model based on consistency training  Cited by: 4


Authors: WANG Ruiqi; JI Shujuan[1]; CAO Ning; GUO Yajie

Affiliation: [1] Shandong Provincial Key Laboratory of Wisdom Mine Information Technology (Shandong University of Science and Technology), Qingdao, Shandong 266590, China

Source: Journal of Computer Applications, 2023, No. 9, pp. 2932-2939 (8 pages)

Funding: National Natural Science Foundation of China (71772107).

Abstract: The proliferation of fake job advertisements not only damages the legitimate rights and interests of job seekers but also disrupts the normal employment order and gives job seekers a very poor user experience. To detect fake job advertisements effectively, a Semi-Supervised fake job advertisement detection model based on Consistency training (SSC) is proposed. First, a consistency regularization term is applied to all the data to improve the performance of the model. Then, the supervised loss and the unsupervised loss are integrated through joint training to obtain a semi-supervised loss. Finally, the semi-supervised loss is used to optimize the model. Experimental results on two real datasets, EMSCAD (EMployment SCam Aegean Dataset) and IMDB (Internet Movie DataBase), show that when only 20 labeled samples are available, SSC achieves the best detection performance: its accuracy is 2.2 and 2.8 percentage points higher than that of the existing advanced semi-supervised learning model UDA (Unsupervised Data Augmentation), and 3.4 and 11.7 percentage points higher than that of the deep learning model BERT (Bidirectional Encoder Representations from Transformers). SSC also shows good scalability.
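
The joint objective described in the abstract (a supervised loss combined with a consistency term) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the loss formulas, so the use of cross-entropy for the supervised part, KL divergence between predictions on an input and its augmented copy for the consistency part, and the weighting factor `lam` are all assumptions, and every function name here is hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Supervised loss for one labeled example (assumed form)."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def semi_supervised_loss(labeled, consistency_pairs, lam=1.0):
    """Combine the two terms into one objective.

    labeled: list of (logits, label) for labeled examples.
    consistency_pairs: list of (logits_original, logits_augmented),
        which may cover both labeled and unlabeled examples.
    lam: hypothetical weight on the consistency term.
    """
    supervised = sum(cross_entropy(l, y) for l, y in labeled) / len(labeled)
    consistency = sum(
        kl_divergence(softmax(a), softmax(b)) for a, b in consistency_pairs
    ) / len(consistency_pairs)
    return supervised + lam * consistency
```

In this sketch the consistency term is zero whenever the model predicts the same distribution for an input and its augmented copy, so only disagreement between the two predictions is penalized.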

Keywords: fake information detection; semi-supervised learning; online recruitment; fake job advertisement; consistency training

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
