Patent key sentence extraction combining keyword guidance and proximal policy optimization of large language models


Authors: WAN Tian, LV Xueqiang [1], MA Denghao (Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science & Technology University, Beijing 100192, China)

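Affiliation: [1] Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science & Technology University, Beijing 100192, China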

Source: Journal of Beijing Information Science and Technology University (Science and Technology Edition), 2025, No. 1, pp. 20-29 (10 pages)

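Funding: National Natural Science Foundation of China (62171043); Beijing Natural Science Foundation (4232025); Qinghai Province Innovation Platform Construction Special Project (2022-ZJ-T02).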

Abstract: Existing patent key sentence extraction methods depend heavily on labeled data, and large language models are expensive to train. To address both problems, this paper proposes KG-PPO, a key sentence extraction method that combines keyword guidance (KG) with proximal policy optimization (PPO) of large language models. First, keyword-key sentence pairs are constructed, a joint matching model evaluates their relevance, and the relevance scores are weighted to produce reward values. Second, Kullback-Leibler (KL) divergence is introduced to measure the difference between the training model and the reference model, while a state value network estimates the value score of the current state. Finally, proximal policy optimization guides the model toward precise extraction of key sentences. Experimental results show that the proposed method outperforms the comparison models on the key sentence extraction task. On the patent dataset, it improves ROUGE-L by 1.92 percentage points and BLEU-4 by 1.20 percentage points over the DiffuSum method, which supports the effectiveness of the method.
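The three steps in the abstract (weighted reward, KL term against a reference model, PPO update guided by a value estimate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the normalized weighted sum, the per-sample log-ratio used as a KL estimate, the one-step advantage, and the defaults `beta=0.1` and `eps=0.2` are all assumptions chosen for illustration.

```python
import math


def weighted_reward(relevance_scores, weights):
    """Combine keyword-sentence relevance scores into one reward.

    Assumption: a normalized weighted sum; the paper's exact weighting
    scheme is not given in the abstract.
    """
    if len(relevance_scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, relevance_scores)) / total


def kl_penalized_reward(reward, logp_policy, logp_ref, beta=0.1):
    """Penalize drift from the reference model.

    Uses the per-sample log-ratio as a simple KL estimate (an assumption);
    beta is a hypothetical penalty coefficient.
    """
    return reward - beta * (logp_policy - logp_ref)


def advantage(shaped_reward, value_estimate):
    """One-step advantage: shaped reward minus the value network's estimate."""
    return shaped_reward - value_estimate


def ppo_clipped_objective(logp_new, logp_old, adv, eps=0.2):
    """Standard PPO clipped surrogate objective for a single sample."""
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * adv, clipped_ratio * adv)
```

In this sketch, the clipping in `ppo_clipped_objective` caps how much a single update can gain from moving the policy away from its previous version, while the KL term in `kl_penalized_reward` discourages drift from the reference model.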

Keywords: key sentence extraction; large language model; joint matching model; proximal policy optimization; Chinese patents

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
