基于Web知识的中文分词结果优化
OPTIMISING CHINESE WORD SEGMENTATION BASED ON WEB KNOWLEDGE

Cited by: 6

Affiliation: [1] School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418

Source: Computer Applications and Software (《计算机应用与软件》), 2015, No. 12, pp. 55-58 (4 pages)

Abstract: As people's activity on the Internet becomes more frequent, new Web words keep emerging. Existing Chinese word segmentation systems recognise new words with relatively low efficiency, which directly affects segmentation precision and, in turn, the service quality of Internet applications. Starting from the output of an existing segmentation system, the paper proposes a new-word recognition method that draws on Web knowledge such as search engines and Baidu Baike (Baidupedia) and combines statistics with matching, and uses it to further optimise the system's original segmentation results. Experimental data show that the proposed method can effectively identify new Web words and optimise the segmentation results.
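As a rough illustration of post-processing a segmenter's output with Web knowledge, the sketch below merges adjacent tokens when the joined string either matches a known entry or is frequent enough. It is only a minimal sketch under stated assumptions, not the paper's algorithm: `merge_new_words`, `known_entries` (standing in for an encyclopedia lookup such as Baidu Baike), `hit_count` (standing in for search-engine statistics), `min_hits`, and `max_span` are all hypothetical names and parameters chosen for this example.

```python
from typing import Callable, List, Set


def merge_new_words(
    tokens: List[str],
    known_entries: Set[str],
    hit_count: Callable[[str], int],
    min_hits: int = 1000,
    max_span: int = 3,
) -> List[str]:
    """Greedily merge adjacent tokens that look like a single new word.

    A candidate is accepted when it matches an entry in `known_entries`
    (hypothetical stand-in for an encyclopedia lookup) or when `hit_count`
    reports at least `min_hits` results (hypothetical stand-in for
    search-engine statistics).
    """
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        chosen, step = tokens[i], 1
        # Try the longest span first so that longer candidates win.
        for span in range(min(max_span, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + span])
            if candidate in known_entries or hit_count(candidate) >= min_hits:
                chosen, step = candidate, span
                break
        merged.append(chosen)
        i += step
    return merged


# Usage: the segmenter split the new word "给力" into two single characters.
tokens = ["我", "喜欢", "给", "力", "的", "活动"]
result = merge_new_words(tokens, {"给力"}, lambda s: 0)
# result == ["我", "喜欢", "给力", "的", "活动"]
```

Trying the longest span first is a design choice for this sketch; a real system would also need thresholds tuned on data and rate-limited access to whatever Web sources it queries.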

Keywords: Chinese word segmentation; out-of-vocabulary words; new Web words; search engine; segmentation optimisation

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
