Authors: LI Shengnan; QU Weiguang[1,2]; WEI Tingxin; ZHOU Junsheng[1]; GU Yanhui; LI Bin[2]
Affiliations: [1] School of Computer and Electronic Information/School of Artificial Intelligence, Nanjing Normal University, Nanjing 210023, China; [2] School of Chinese Language and Literature, Nanjing Normal University, Nanjing 210097, China; [3] International College for Chinese Studies, Nanjing Normal University, Nanjing 210097, China
Source: Computer Engineering and Applications (《计算机工程与应用》), 2023, No. 5, pp. 289-296 (8 pages)
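Funding: General Program of the National Natural Science Foundation of China (61772278); General Project of the Philosophy and Social Science Foundation of Higher Education Institutions of Jiangsu Province (2019JSA0220); General Program of the National Social Science Fund of China (18BYY127).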
Abstract: "V+V" is one of the most common structures in modern Chinese. It can form several entirely different syntactic structures, such as concurrent structures and serial verb structures, which makes syntactic and semantic parsing difficult. To identify the syntactic structure types and sequence relations formed by "V+V", the authors design an annotation specification that handles nested "V+V" structures and use it to build a "V+V" corpus containing 5,381 concurrent sentences, 7,987 serial verb sentences, and 1,212 sentences in which concurrent and serial verb structures are nested. They then propose a model based on BiLSTM-CRF and multi-head attention that identifies the syntactic and semantic roles of the multiple verbs and nouns in a structure at the same time. Whereas previous work identified either concurrent structures or serial verb structures in isolation, this model identifies both within a unified framework and also handles nested concurrent and serial verb structures. Experiments on the constructed corpus show that the method reaches an F1 score of 92.12% on the test set.
Keywords: V+V sequence relations; serial verb structure; concurrent structure; Chinese Abstract Meaning Representation
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]