Authors: ZHOU Bin-bin (周彬彬); ZHANG Hong-jun (张宏军); ZHANG Rui (张睿); FENG Yun-tian (冯蕴天); XU You-wei (徐有为) (School of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China)
Source: Computer Science (《计算机科学》), 2019, Issue B06, pp. 540-546 (7 pages)
Abstract: Identifying and annotating military text is the key to building a military corpus. For the entities in military corpora, this paper proposes a unified set of part-of-speech tagging specifications for military terminology and a set of annotation specifications for military corpora, and designs an automatically extensible framework, based on a military terminology dictionary, for extracting entity features from military corpora. The framework uses a purpose-built high-precision classifier to select and extract basic features, combines them with the typical features of military language to form a feature set, and builds a feature space corrected by the military terminology dictionary. After entity recognition is performed on the military corpus, the entities are annotated according to the specified annotation and part-of-speech tagging specifications, producing a relatively large-scale, high-quality military corpus. Experiments show that the framework handles corpus entity recognition and corpus annotation well, which benefits the construction of military corpora and helps clarify their broad role and application prospects in the military domain.
Keywords: military entity annotation; military-term part-of-speech tagging; feature extraction; military corpus
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]