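Authors: LIU Yahui (刘亚慧), YANG Haoping (杨浩苹), LI Zhenghua (李正华)[1], ZHANG Min (张民)[1] (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Affiliation: [1] School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Source: Journal of Chinese Information Processing (《中文信息学报》), 2020, No. 4, pp. 10-20 (11 pages)
Funding: National Natural Science Foundation of China (61525205, 61876116); Priority Academic Program Development of Jiangsu Higher Education Institutions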
Abstract: As the mainstream form of shallow semantic representation, semantic role labeling (SRL) has long been an active research topic in natural language processing (NLP). The existing SRL annotation guidelines (the PropBank guideline and the Peking University guideline) have three main problems. First, span-based argument representation makes annotation harder. Second, the predicate frames required by PropBank are difficult to define. Third, the Peking University guideline does not annotate omitted arguments. After a thorough survey, this paper combines the strengths of existing Chinese and English SRL guidelines with lessons from practical annotation, and proposes a lightweight Chinese SRL annotation guideline suitable for annotators without a linguistics background. First, it adopts a word-based argument representation, which avoids determining span boundaries and thus reduces annotation difficulty. Second, annotators label the argument roles of a predicate directly from the sentence context, without pre-defining all semantic frames of that predicate. Third, omitted core arguments are annotated explicitly, so that the semantics of the sentence are described more precisely. In addition, to ensure annotation consistency and improve data quality, the guideline gives explicit priority rules and analyses of difficult cases for a range of complex linguistic phenomena.
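Keywords: semantic role labeling; annotation guideline; shallow semantic analysis; argument role; predicate
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]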