Authors: 李俊飞 (LI JunFei), 徐黎明 (XU LiMing), 汪洋 (WANG Yang) [1,2], 魏鑫 (WEI Xin)
Affiliations: [1] Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China; [2] School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
Source: 《数据与计算发展前沿》 (Frontiers of Data & Computing), 2023, No. 4, pp. 86-100 (15 pages)
Funding: Chinese Academy of Sciences project on situational awareness operation, maintenance, and application support (WX1450201-0105-02).
Abstract:
[Objective] Citation classification of scientific and technological literature underpins tasks such as academic impact evaluation and literature retrieval and recommendation. With the development of deep neural networks and pre-trained language models, research on citation classification has made great progress, and many deep-learning-based methods, models, and datasets have been proposed. However, a comprehensive survey of existing methods and the latest trends is still lacking; this paper addresses that gap.
[Methods] This paper reviews deep-learning-based citation classification models and datasets for scientific and technological literature, compares and analyzes the classification performance of different models, and summarizes their advantages and disadvantages. It also summarizes citation classification techniques, discusses future development directions, and offers recommendations.
[Results] Pre-trained language models can effectively learn global semantic representations. They mitigate the low training efficiency of RNNs (Recurrent Neural Networks) and the limited length of sequential dependency features that CNNs (Convolutional Neural Networks) can extract from text, and they significantly improve classification accuracy.
[Limitations] This paper mainly introduces progress in citation classification techniques for scientific and technological literature and does not comprehensively predict future technical directions.