检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:金贤日 欧石燕[1] Hyonil Kim;Ou Shiyan(School of Information Management,Nanjing University,Nanjing 210023,China)
出 处:《数据分析与知识发现》2021年第1期66-77,共12页Data Analysis and Knowledge Discovery
基 金:国家社会科学基金重点项目(项目编号:17ATQ001)的研究成果之一。
摘 要:【目的】探索施引文献中引用文本自动识别方法,并比较不同类型引用句在内容上的差别。【方法】提出一种无监督引用文本识别方法,通过比较候选句与施引文献和被引文献的文本相似度确定隐性引用句。为了精确计算文本相似度,提出向量空间模型与词嵌入模型相结合的两种文档向量模型。【结果】分别对两篇高被引论文约200篇施引文献中的隐性引用句进行了识别,本文方法的F值均达到92%以上。通过对显性引用句和隐性引用句的内容进行比较,发现两者在引用功能和情感上有明显区别:表达研究背景和技术基础的隐性引用句比例要高于显性引用句,而表达研究基础和研究比较的隐性引用句比例要低于显性引用句;45.3%的显性引用句为正面引用,而78.8%的隐性引用句为中性引用。【局限】仅对句子层面的引用文本进行识别,在短语层面的引用文本识别还有待于进一步探索。【结论】在识别引用文本时有必要识别隐性引用句,本文提出的引用文本识别方法性能较高。[Objective] This paper proposes a method to automatically identify citation texts and compare the contents of citation sentences. [Methods] We developed an unsupervised method to find the implicit citation sentences and then compared the similarity of these sentences and the citing/cited papers. We combined the vector space and the word embedding models to calcuate the similarity precisely. [Results] We identified the implicit citation sentences of two higly-cited papers from 200 citing articles and found the proposed method’s F-value was above 92%. By comparing the contents of the explicit and implicit citaiton senstences, we noticed their significant difference in citation functions and sentiments. There were more implicit citation sentences for research background and technical basis than the explicit ones. There were also fewer implicit citation sentences for research basis and comparison than the explicit ones. 45.3% of the explicit citation sentences were positive references while 78.8% of implicit citation sentences were neutral. [Limitations] We only investigated citation texts at sentence level. More research is needed to discuss the clause and phrase-level identifications.[Conclusions] The proposed method could effectively identify implicit citation sentences.
分 类 号:TP393[自动化与计算机技术—计算机应用技术] G250[自动化与计算机技术—计算机科学与技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.145