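Affiliation: [1] Institute for Information Retrieval and Knowledge Mining, School of Information Management, Wuhan University, Wuhan 430072
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2016, No. 5, pp. 530-538 (9 pages)
Funding: One of the research outcomes of the National Natural Science Foundation of China General Program project "Lexical-function-oriented semantic recognition of academic text and knowledge graph construction" (No. 71473183) and the Ministry of Education Major Project of Key Research Bases for Humanities and Social Sciences "Research on fine-grained web information retrieval models and framework construction" (No. 10JJD630014)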
Abstract: Structure function recognition for academic text is a section-level text classification problem: its essence is to identify the structural function of each section. This paper divides paragraph-based structure function recognition into two subproblems: recognizing paragraph position, and recognizing the structure function of a section by majority voting over its paragraphs. Experiments on a large-scale, automatically constructed dataset show that, although paragraph-based recognition performs worse than recognition based on the full content of a section, it still achieves good results, which indicates that the paragraph-based approach is feasible. Drawing on the experimental results, the paper analyzes two important factors that affect paragraph-based recognition: paragraph length and the number of paragraphs in a section. It closes by summarizing the three levels of structure function recognition for academic text and by pointing out open problems, directions for further study, and some potential applications.
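The voting step named in the abstract can be illustrated with a minimal sketch. It assumes each paragraph in a section has already received a predicted label from some classifier, and it takes the most frequent label as the section's structure function. The function name `vote_section_label`, the example labels, and the tie-breaking rule are assumptions made for illustration; the record does not describe the paper's actual implementation.

```python
from collections import Counter


def vote_section_label(paragraph_labels):
    """Return the most frequent paragraph-level label for one section.

    Hypothetical helper: ties are resolved in favor of the label that
    appears first in the input, which is how Counter.most_common orders
    elements with equal counts.
    """
    if not paragraph_labels:
        raise ValueError("a section needs at least one paragraph label")
    label, _count = Counter(paragraph_labels).most_common(1)[0]
    return label


# Hypothetical per-paragraph predictions for a single section.
predicted = ["method", "method", "introduction", "method", "experiment"]
print(vote_section_label(predicted))  # prints "method"
```

Under this scheme a section with very few paragraphs gives the vote little to work with, which is consistent with the abstract naming the number of paragraphs per section as a factor in recognition quality.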