Patent Technology Topic Identification and Evolution Analysis Based on PaECTER-BERTopic and Large Models: A Case Study of Generative Artificial Intelligence


Authors: 黄怡 (HUANG Yi); 隗玲 (WEI Ling) [1]; 张凯 (ZHANG Kai) (School of Information, Shanxi University of Finance and Economics, Taiyuan 030006, P.R. China)

Affiliation: [1] School of Information, Shanxi University of Finance and Economics, Taiyuan 030006

Source: Digital Library Forum (《数字图书馆论坛》), 2025, No. 2, pp. 1-11 (11 pages)

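Funding: Supported by the Youth Science Fund of the National Natural Science Foundation of China, project "Research on Methods for Identifying and Predicting Emerging Technology Evolution Paths Based on Multi-View Scientific and Technological Knowledge Graph Fusion" (No. 72304176).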

Abstract: To address the poor vectorized representation of patent texts and the limited interpretability of patent technology topic identification results, this paper proposes a method for patent technology topic identification and evolution analysis based on the PaECTER patent pre-trained language model, BERTopic, and a large model. First, the PaECTER patent pre-trained language model is used to vectorize the patent texts. Second, patent technology topics are identified with the BERTopic model combined with KeyBERT, and the GPT-4o large model is used to analyze the identified topics systematically. Third, similarity between patent technology topics is computed on the basis of PaECTER to link related topics and generate the patent technology evolution path. Finally, the generative artificial intelligence domain is taken as a case study to verify the effectiveness of the proposed method. The experimental results show that, compared with the traditional BERTopic model, the proposed method improves the interpretability, consistency, and diversity of patent technology topics and identifies patent technology evolution paths accurately. It also reveals the development status and evolution paths of technologies in the generative artificial intelligence domain, providing a theoretical reference for related research.

Keywords: patent text; technology topic identification; technology evolution analysis; PaECTER-BERTopic; large model

Classification No.: G350.7 [Culture and Science: Information Science]
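
The third step of the abstract (computing similarity between topics to link them into an evolution path) can be illustrated with a minimal sketch. The abstract does not state which similarity measure, threshold, or time slicing the authors use, so everything here is an assumption: cosine similarity, a hypothetical threshold of 0.8, two hypothetical time periods, made-up topic labels, and toy 3-dimensional vectors standing in for PaECTER topic embeddings. The function names `cosine` and `link_topics` are illustrative only and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_topics(earlier, later, threshold=0.8):
    """Link topics of two adjacent periods whose similarity meets the threshold.

    earlier, later: dict mapping a topic label to its embedding vector.
    threshold: hypothetical cut-off, not a value taken from the paper.
    Returns (earlier_label, later_label, similarity) tuples, most similar first.
    """
    links = []
    for label_a, vec_a in earlier.items():
        for label_b, vec_b in later.items():
            sim = cosine(vec_a, vec_b)
            if sim >= threshold:
                links.append((label_a, label_b, sim))
    return sorted(links, key=lambda link: link[2], reverse=True)


# Toy vectors standing in for PaECTER topic embeddings (assumed data).
period_1 = {
    "text generation": [1.0, 0.1, 0.0],
    "image synthesis": [0.0, 1.0, 0.2],
}
period_2 = {
    "dialogue models": [0.9, 0.2, 0.1],
    "video synthesis": [0.1, 0.9, 0.4],
}

links = link_topics(period_1, period_2)
for source, target, sim in links:
    print(f"{source} -> {target} (similarity {sim:.2f})")
```

Each retained pair is one candidate edge of an evolution path between adjacent periods. In practice the vectors would come from the embedding model rather than being written by hand, and the threshold would need to be tuned on the corpus.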

 
