Authors: LUO Chengduo (罗程多); ZHAO Yao (赵耀) [1] (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)
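Affiliation: [1] State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China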
Source: Network New Media Technology (《网络新媒体技术》), 2019, No. 3, pp. 63-66, 22 (5 pages in total)
Funding: National "863 Program" project "Development of a Converged Network Service Architecture" (2011AA01A102)
Abstract: Data-driven methods are a defining characteristic of current machine learning and artificial intelligence techniques, and high-quality, large-scale annotated datasets are the foundation of technical progress in a field. In natural language processing, the quality and quantity of annotated data directly determine whether a language processing task can be standardized and whether methods and models can be evaluated and compared under fair conditions. Manual annotation of language data, however, is a tedious and complex process that raises problems of annotation quality, annotation management, and annotation efficiency, among others. To address these problems, researchers have proposed many linguistic annotation tools and frameworks. This paper introduces the basic theory and techniques of linguistic annotation, and reviews and compares two mainstream annotation frameworks, GATE (General Architecture for Text Engineering) and UIMA (Unstructured Information Management Architecture).