Authors: ZHONG Xin-yu (钟昕妤); LI Yan (李燕) [1]
Affiliation: [1] School of Information Engineering, Gansu University of Traditional Chinese Medicine, Lanzhou 730101, Gansu, China
Source: Software Guide (《软件导刊》), 2023, No. 2, pp. 225-230 (6 pages)
Funding: Graduate Innovation Fund Project of Gansu University of Traditional Chinese Medicine (2022CX137).
Abstract: Chinese word segmentation, a foundational task in the machine processing of Chinese, has been an active research topic in recent years. Its output strongly affects downstream processing tasks, which makes it well worth studying. A comprehensive analysis of the word segmentation literature from the past five years indicates that future research will be dominated by fusion methods built on neural network models, in pursuit of more accurate and efficient segmentation. The development and wider adoption of segmentation technology are nonetheless constrained by several bottlenecks. Beyond the traditional problems of ambiguity and unknown (out-of-vocabulary) words, segmentation now faces newer challenges such as dependence on corpus scale and quality and multi-domain segmentation. Breakthroughs on these new problems will be one focus of future research.
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]