Authors: ZENG Lanlan (曾兰兰), WANG Yisong (王以松)[1], CHEN Panfeng (陈攀峰) (College of Computer Science and Technology, Guizhou University, Guiyang Guizhou 550025, China)
Affiliation: [1] College of Computer Science and Technology, Guizhou University, Guiyang 550025
Source: Journal of Computer Applications (《计算机应用》), 2022, No. 10, pp. 3011-3017 (7 pages)
Funding: National Natural Science Foundation of China (U1836205).
Abstract: Correctly identifying the entities in judgment documents is an important foundation for building legal knowledge graphs and realizing smart courts. However, commonly used Named Entity Recognition (NER) models do not handle the representation of polysemous words or entity boundary recognition errors in judgment documents well. To improve the recognition of the various entity types in judgment documents, a BiLSTM-CRF model based on Joint Learning and BERT (Bidirectional Encoder Representations from Transformers), called JLB-BiLSTM-CRF, is proposed. First, the input character sequence is encoded by BERT to strengthen the representation ability of the word vectors. Then, a Bidirectional Long Short-Term Memory (BiLSTM) network models the long-text information, and the NER task is trained jointly with a Chinese Word Segmentation (CWS) task to improve the boundary recognition rate of entities. Experimental results show that the proposed model reaches a precision of 94.36%, a recall of 94.94%, and an F1 score of 94.65% on the test set, which are 1.05, 0.48, and 0.77 percentage points higher, respectively, than those of the BERT-BiLSTM-CRF model, verifying the effectiveness of JLB-BiLSTM-CRF on the NER task for judgment documents.
Keywords: judgment documents; bidirectional long short-term memory network; BERT; joint learning; named entity recognition
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
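
The precision, recall, and F1 figures reported in the abstract can be cross-checked with a little arithmetic. The sketch below assumes that F1 is the standard harmonic mean of precision and recall, which the record does not state explicitly. The baseline scores for BERT-BiLSTM-CRF are not given in the record either; they are derived here by subtracting the reported gains, and the function name `f1_score` is purely illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    return 2 * precision * recall / (precision + recall)


# Scores reported in the abstract for JLB-BiLSTM-CRF (percent).
jlb_p, jlb_r, jlb_f1 = 94.36, 94.94, 94.65

# Reported gains over BERT-BiLSTM-CRF (percentage points).
gain_p, gain_r, gain_f1 = 1.05, 0.48, 0.77

# Baseline scores implied by the gains (derived, not stated in the record).
base_p = round(jlb_p - gain_p, 2)
base_r = round(jlb_r - gain_r, 2)
base_f1 = round(jlb_f1 - gain_f1, 2)

# Recompute F1 from precision and recall and compare with the reported values.
jlb_f1_check = round(f1_score(jlb_p, jlb_r), 2)
base_f1_check = round(f1_score(base_p, base_r), 2)

print(f"JLB-BiLSTM-CRF   F1: reported {jlb_f1}, recomputed {jlb_f1_check}")
print(f"BERT-BiLSTM-CRF  F1: implied  {base_f1}, recomputed {base_f1_check}")
```

Under that assumption, the recomputed F1 values agree with the reported and implied ones to two decimal places, so the three metrics and the three gains are mutually consistent.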