Authors: 王益成 (WANG Yicheng), 蒋星宇 (JIANG Xingyu), 郑彦宁 (ZHENG Yanning)[2] (School of Management Science and Engineering, Anhui University of Finance and Economics, Bengbu 233030, China; Institute of Scientific and Technical Information of China, Beijing 100038, China)
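Affiliations: [1] School of Management Science and Engineering, Anhui University of Finance and Economics, Bengbu, Anhui 233030; [2] Institute of Scientific and Technical Information of China, Beijing 100038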
Source: 《情报科学》 (Information Science), 2024, No. 9, pp. 51-60 (10 pages)
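Funding: China Postdoctoral Science Foundation project "Research on Constructing a Data Mining and Analysis Method System for Scientific and Technical Reports Oriented to Decision Support" (2021M703124)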
Abstract: [Purpose/significance] Scientific and technical report data are a fundamental national strategic resource, and research on technologies and methods for developing and using them is imperative. Identifying the research topics in the field of biotechnology and how they evolve helps fill a gap in the development and application scenarios of scientific and technical report data. [Method/process] A text corpus of scientific and technical reports in the biotechnology field was constructed, and a BERTopic topic model was trained to mine the field's research topics and study their evolution. [Results/conclusion] The BERTopic model identified 30 topics in the biotechnology field. Hierarchical clustering of these topics revealed nine major research directions in biotechnology, namely plant genomics and genetic modification; genetic engineering and industrial biotechnology; the application of biotechnology in biological and ecological environments; veterinary virology and immunology; molecular genetics and biochemistry; cardiovascular metabolic health and neurobiology; bone biology and regenerative medicine; and biomedical and clinical research. [Innovation/limitation] The constructed model can better identify the research topics present in scientific and technical report data, and the topic descriptions it generates for the biotechnology field are of good quality. The corpus covers only the abstract and time fields of the report data for semantic analysis; other fields were not analyzed.
Keywords: scientific and technical reports; biotechnology; BERTopic model; topic mining; topic evolution