Smart Chinese word segmentation method based on Bi-LSTM-6Tags (cited by: 6)

Author: WANG Wei (Graduate School, Academy of Military Sciences, Beijing 100091, China)

Source: Journal of Computer Applications, 2018, Issue A02, pp. 107-110 (4 pages)

Abstract: To address the incomplete semantic understanding and insufficient word-position information in current deep-learning-based Chinese word segmentation algorithms, this paper proposes a Chinese word segmentation method that combines a bidirectional long short-term memory (Bi-LSTM) neural network with a six-tag word-position labeling set. First, the Bi-LSTM network automatically discovers text features. Then, the six-tag labeling set is used to perform Chinese word segmentation efficiently and accurately on the basis of the deep semantics of the text. Finally, the method is validated experimentally on the Bakeoff 2005 corpus provided by the Second International Chinese Word Segmentation Bakeoff (SIGHAN). Under the same experimental conditions, it improves segmentation accuracy by 3%, 4%, and 1% over the conditional random field (CRF) method, the unidirectional LSTM method, and the four-tag Bi-LSTM method, respectively, which shows that the proposed method is reasonable and effective.
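To make the idea of six-tag word-position labeling concrete, here is a minimal sketch of how segmented words could be converted into per-character tags and back. The abstract does not list the tag names, so the set used here (B, B2, B3, M, E, S, a common six-tag convention) is an assumption, and the helper names `word_to_tags`, `sentence_to_tags`, and `tags_to_words` are hypothetical. The Bi-LSTM model that would predict these tags is not shown.

```python
from typing import List

# Assumed six-tag set (not specified in the abstract):
#   S  = single-character word
#   B  = first character, B2 = second, B3 = third character of a multi-character word
#   M  = any further middle character
#   E  = last character of a multi-character word
HEAD_TAGS = ["B", "B2", "B3"]


def word_to_tags(word: str) -> List[str]:
    """Return one tag per character of a single word."""
    n = len(word)
    if n == 0:
        raise ValueError("word must not be empty")
    if n == 1:
        return ["S"]
    # Every character except the last gets B, B2, B3, then M; the last gets E.
    tags = [HEAD_TAGS[i] if i < len(HEAD_TAGS) else "M" for i in range(n - 1)]
    tags.append("E")
    return tags


def sentence_to_tags(words: List[str]) -> List[str]:
    """Flatten the tags of a segmented sentence into one sequence."""
    tags: List[str] = []
    for word in words:
        tags.extend(word_to_tags(word))
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild words from characters and tags by cutting after each E or S."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # keep a trailing fragment if the tag sequence ends mid-word
        words.append(current)
    return words


if __name__ == "__main__":
    words = ["我", "喜欢", "自然语言处理"]
    tags = sentence_to_tags(words)
    print(tags)
    print(tags_to_words("".join(words), tags))
```

Under this assumed scheme, a four-tag set (B, M, E, S) would label every non-final character after the first as M, whereas the six-tag set distinguishes the second and third positions, giving the tagger more word-position information for longer words.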

Keywords: bidirectional LSTM; six-tag word-position labeling; Chinese word segmentation

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
