Recognition of named entity in Chinese resume based on ALBERT (基于ALBERT的中文简历命名实体识别)

Cited by: 4

Authors: YU Dan-dan (余丹丹); HUANG Jie (黄洁); DANG Tong-xin (党同心) [2]; ZHANG Ke (张克)

Affiliations: [1] School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450003, Henan, China; [2] College of Data Target Engineering, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, Henan, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2024, No. 1, pp. 261-267 (7 pages)

Funding: National Natural Science Foundation of China (62071490).

Abstract: Existing entity recognition methods for electronic resumes have low accuracy. The BERT pre-trained language model achieves higher accuracy, but its large parameter count and long training time limit its practical use. This paper proposes a named entity recognition method for Chinese electronic resumes based on ALBERT. The lightweight ALBERT language model embeds the input text to produce dynamic word vectors, which addresses polysemy. A BiLSTM captures contextual structure features and mines deeper semantic relationships. The concatenated vectors are fed to a CRF layer, which learns the constraints between labels and uses Viterbi decoding to output the final label sequence. Experimental results show that the method achieves an F1 score of 94.86% on the Resume electronic resume dataset.
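The Viterbi decoding step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag set (`O`, `B-NAME`, `I-NAME`), the score values, and the function name `viterbi_decode` are all assumptions chosen for illustration. In the described pipeline the emission scores would presumably come from the BiLSTM output and the transition scores from the trained CRF layer.

```python
from typing import List, Sequence

# Hypothetical tag set for illustration only; the paper's label scheme is not given here.
TAGS = ["O", "B-NAME", "I-NAME"]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM layer)
    transitions[i][j] : score of moving from tag i to tag j (e.g. learned by a CRF)
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])          # best score of any path ending in each tag
    backpointers = []                   # backpointers[t-1][j] = best previous tag for j at t
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores: a large negative transition forbids O -> I-NAME.
FORBIDDEN = -1e4
TRANSITIONS = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.5],        # from B-NAME
    [0.0, 0.0, 0.5],        # from I-NAME
]
EMISSIONS = [
    [1.0, 0.8, 0.0],  # per-position argmax would pick O
    [0.0, 0.2, 2.0],  # per-position argmax would pick I-NAME (invalid after O)
    [1.5, 0.0, 0.3],  # per-position argmax would pick O
]

decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)
```

With these made-up scores, choosing the best tag independently at each position would give `O, I-NAME, O`, which violates the label constraint. Decoding over the whole sequence instead selects `B-NAME, I-NAME, O`, which is the kind of constraint between labels that the abstract attributes to the CRF layer.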

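Keywords: electronic resume; named entity recognition; pre-trained language model; bidirectional long short-term memory network; conditional random field; neural network; deep learning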

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
