Semantic-driven learning and classification method of judicial documents (语义驱动的司法文档学习分类方法)  Cited by: 2

Authors: MA Jiangang (马建刚); MA Yinglong (马应龙) [4]

Affiliations: [1] Law School, Renmin University of China, Beijing 100872, China; [2] National Prosecutors College of P. R. C., Beijing 102206, China; [3] The People's Procuratorate of Henan Province, Zhengzhou, Henan 450004, China; [4] School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China

Source: Journal of Computer Applications (《计算机应用》), 2019, No. 6, pp. 1696-1700 (5 pages)

Funding: National Key Research and Development Program of China (2018YFC0831404, 2018YFC0830605); China Postdoctoral Science Foundation (2016M591317)

Abstract: Efficient classification of large-scale judicial documents supports current judicial intelligence applications such as similar-case recommendation, legal document retrieval, judgment prediction, and sentencing assistance. General-domain text classification methods perform poorly on judicial texts because they do not account for the complex structure and knowledge semantics of those texts. To address this problem, a semantic-driven method is proposed for learning and classifying judicial documents. First, a domain knowledge model for the judicial domain is proposed and constructed to express document-level semantics clearly. Then, domain knowledge is extracted from judicial documents based on this model. Finally, a Graph Long Short-Term Memory (Graph LSTM) model is used to train on and classify the judicial documents. Experimental results show that the proposed method clearly outperforms the commonly used Long Short-Term Memory (LSTM) model, Multinomial Logistic Regression (MLR), and Support Vector Machine (SVM) in accuracy and recall.
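The first two steps named in the abstract (a domain knowledge model, then knowledge extraction guided by it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the concept names, the regular-expression patterns, the sample sentence, and the helper names `extract_knowledge` and `build_concept_graph` are hypothetical and are not taken from the paper. The Graph LSTM classifier itself is not implemented here.

```python
import re
from collections import defaultdict

# Hypothetical toy domain knowledge model: concept -> surface patterns.
# These concepts and patterns are illustrative assumptions, not the paper's model.
DOMAIN_MODEL = {
    "Defendant": [r"\bdefendant\b"],
    "Charge": [r"\btheft\b", r"\bfraud\b"],
    "Evidence": [r"\bwitness statement\b", r"\bsurveillance video\b"],
    "Sentence": [r"\bimprisonment\b", r"\bfine\b"],
}


def extract_knowledge(text):
    """Return (position, concept, mention) tuples found in text, in reading order."""
    hits = []
    lowered = text.lower()
    for concept, patterns in DOMAIN_MODEL.items():
        for pattern in patterns:
            for match in re.finditer(pattern, lowered):
                hits.append((match.start(), concept, match.group()))
    return sorted(hits)


def build_concept_graph(hits):
    """Link each mention to the next one, giving a small directed concept graph."""
    graph = defaultdict(set)
    for (_, src, _), (_, dst, _) in zip(hits, hits[1:]):
        if src != dst:
            graph[src].add(dst)
    return {node: sorted(neighbours) for node, neighbours in graph.items()}


doc = ("The defendant was charged with theft. A surveillance video was "
       "submitted, and the court imposed a fine.")
hits = extract_knowledge(doc)
graph = build_concept_graph(hits)
print(graph)
```

A graph of this kind is the sort of structured input a graph-based sequence model could consume in the third step; how the paper actually encodes documents for Graph LSTM is not stated in the abstract.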

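Keywords: judicial big data; domain knowledge model; text classification; intelligent procuratorial work; Graph Long Short-Term Memory (Graph LSTM)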

Classification code: TP309 [Automation and Computer Technology: Computer System Architecture]

 
