A Training Framework for Chinese Name Recognition  Cited by: 1

Authors: WANG Jia-wen, WANG Chuan-dong [1], YANG Yan-ying [2]

Affiliations: [1] School of Computer and Software, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China; [2] Nanjing Forest Police College, Nanjing 210023, Jiangsu, China

Source: Computer Technology and Development, 2018, No. 7, pp. 53-57, 62 (6 pages in total)

Funding: Fundamental Research Funds for the Central Universities (LGZD201502; LGYB201603)

Abstract: Chinese name recognition is a key technology in Chinese language processing and is widely used in text mining, semantic analysis, machine translation, and other fields. As data become increasingly massive and heterogeneous, named entity recognition for Chinese personal names has become one of the current research hotspots in Chinese natural language processing. Most existing methods rely on prior domain knowledge and hand-engineered features, so building a recognition model often demands extensive linguistic knowledge from researchers. To reduce or even eliminate this dependence, the paper aims to establish a fairly flexible deep neural network architecture that learns internal representations from a large-scale unlabeled corpus and recognizes Chinese names with an unsupervised method. Experimental results show that the model performs well and does not require excessive computing resources, making it effective for Chinese name recognition applications.

Keywords: natural language processing; deep learning; neural network; Chinese name recognition

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
