Author: QIAN Xiaofei (钱小飞) [1]
Affiliation: [1] College of Liberal Arts, Shanghai University, Shanghai 200444, China
Source: Journal of Zhejiang International Studies University (《浙江外国语学院学报》), 2019, No. 6, pp. 59-67 (9 pages)
Abstract: Chinese noun phrases have complex internal structure. Identifying the longest nominal constituent nested inside a noun phrase helps resolve low-level syntactic ambiguity and uncover argument structure and semantic relations. This paper analyzes the multi-level distribution of inner maximal noun phrases in Chinese and identifies data sparsity, structural ambiguity, and boundary ambiguity as the main difficulties in recognizing them. It proposes a recognition method that combines a Conditional Random Field model with promotion rules based on nominal base chunks, achieving a structural precision of 85.23% and a structural recall of 78.71%. The experimental results show that recognition performance is limited mainly by ambiguity arising from misrecognized higher-level structures, coordinate structures, the "v n n" pattern, subject-predicate structures following "De", and special ambiguous sequences. Addressing these problems requires more syntactic and semantic knowledge, for example adding simple chunks that contain verbs to the lexicon and introducing a mechanism for verifying syntactic rules at the syntactic level.