Authors: CHEN Jinzhuan (陈金传)[1]; CHENG Zhiqiang (成志强); XIONG Zequan (熊泽泉)[1]; YU Yaxiu (于亚秀)[1] (Library of East China Normal University, Shanghai 200241, China)
Source: Information Science (《情报科学》), 2024, No. 11, pp. 76-83, 111 (9 pages)
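Funding: National Social Science Fund of China project "Research on an Innovative Information Resource Service System for Interdisciplinary Integration" (23BTQ084)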
Abstract: [Purpose/Significance] Changes in library borrowing data reflect shifts in what borrowers were paying attention to in a given year and, to some extent, the research hotspots of society as a whole. This paper uses a large language model to model the relationship between the fields of university library borrowing and reservation records and social hotspots, explores that relationship, and supports the analysis of social hotspots over a given period.
[Method/Process] First, a word segmentation model for book titles is built on an encoder-decoder structure and trained on large word segmentation datasets, yielding raw word frequencies. Next, domain matching is performed using the reader's department and call number fields. Finally, the raw word frequencies are reweighted from three perspectives (number of loans, reservation duration, and domain) to produce the final hot word cloud related to social hotspots.
[Results/Conclusions] Experiments on the segmentation model show that the proposed algorithm achieves clearly higher F-values than other algorithms on the MSR, PKU, and CTB6 datasets; on CTB6 its F-value reaches 97.18, which is 3.15 percentage points higher than the CRF algorithm. The segmentation algorithm with domain optimization performs better on highly specialized text. Experiments on library borrowing and reservation data then demonstrate the advantages of the hot word cloud generation framework based on domain segmentation optimization, and show that the hot words it generates can be linked to social hotspots to some degree.
[Innovation/Limitations] The paper examines the field characteristics of book borrowing and reservation data and proposes a BERT-based borrowing hotspot generation framework with domain segmentation optimization. Although the framework exploits the characteristics of library data fields and improves the generated word clouds, it offers no quantitative metric for the quality of hot word cloud generation, which calls for further exploration.
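The reweighting step described in the abstract (raw title-word frequencies adjusted by number of loans, reservation duration, and domain) can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so everything here is an assumption made for illustration: the `LoanRecord` fields, the multiplicative form of the weight, the 30-day normalization, and the `domain_boost` value are all hypothetical, and the title tokens are assumed to come from a segmenter that has already been run.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class LoanRecord:
    """One borrowing/reservation record (hypothetical schema)."""
    title_tokens: List[str]   # segmented book title, assumed precomputed
    borrow_count: int         # how many times the book was borrowed
    reservation_days: float   # how long the book stayed reserved
    in_domain: bool           # whether department/call number matched a domain


def weighted_term_frequencies(
    records: Iterable[LoanRecord], domain_boost: float = 1.5
) -> Tuple[Counter, Counter]:
    """Return (raw, weighted) term frequencies.

    The weight formula is an illustrative assumption, not the paper's:
    weight = (1 + borrow_count) * (1 + reservation_days / 30),
    multiplied by domain_boost when the record matched a domain.
    """
    raw: Counter = Counter()
    weighted: Counter = Counter()
    for rec in records:
        weight = (1 + rec.borrow_count) * (1 + rec.reservation_days / 30.0)
        if rec.in_domain:
            weight *= domain_boost
        for tok in rec.title_tokens:
            raw[tok] += 1          # unweighted occurrence count
            weighted[tok] += weight  # occurrence scaled by record weight
    return raw, weighted


records = [
    LoanRecord(["large", "model"], borrow_count=3, reservation_days=30, in_domain=True),
    LoanRecord(["model", "library"], borrow_count=0, reservation_days=0, in_domain=False),
]
raw, weighted = weighted_term_frequencies(records)
top_terms = weighted.most_common(3)  # candidates for the largest words in the cloud
```

The weighted counts, rather than the raw ones, would then drive word size in the cloud, so that a term from a heavily borrowed, long-reserved, in-domain title outranks one that merely appears often.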