High-Value Invention Patent Identification Based on Contrastive Learning: Taking the Field of Wireless Communication Network as an Example


Authors: Xue Hang; Shi Guoliang [1]; Chen Ting (Business School, Hohai University, Nanjing 211100)

Affiliation: [1] Business School, Hohai University, Nanjing 211100

Source: Journal of Intelligence (《情报杂志》), 2024, No. 9, pp. 179-187 (9 pages)

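Funding: Research output of the Fundamental Research Funds for the Central Universities project "Research on Key Technologies of a Water Conservancy Knowledge Graph Based on Graph Databases" (No. B200207036).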

Abstract: [Research purpose] Accurately and efficiently identifying high-value invention patents among a large number of patents not only advances the implementation of China's intellectual property strategy but also helps promote the technology transfer of high-value invention patents. [Research method] To address the insufficient use of domain patent texts, BERT is first pre-trained with contrastive learning on patent texts from the wireless communication network field, yielding a domain-adapted BERT model. The domain-adapted BERT model is then used to train a high-value invention patent identification model, and an oversampling strategy is applied during training to alleviate the imbalance between positive and negative samples and improve model performance. [Research conclusion] Experiments on a dataset of 62,000 Chinese invention patents in the wireless communication network field show that the model trained with contrastive learning and oversampling reaches an Accuracy of 97% and a Macro-F1 of 0.93, improvements of 9.77% and 0.19, respectively, over using BERT directly.
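The abstract names an oversampling strategy and reports both Accuracy and Macro-F1 but gives no implementation details. The following is a minimal sketch, not the authors' code: it assumes plain random oversampling (duplicating minority-class items until the classes are balanced) and the standard unweighted mean of per-class F1 scores. The function names and the toy labels are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class items at random until every class
    has as many items as the largest class (assumed strategy)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for y, items in by_class.items():
        # Draw the missing items with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([y] * target)
    return out_samples, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 9 ordinary patents (0) and 1 high-value patent (1).
toy_texts = [f"patent_{i}" for i in range(10)]
toy_labels = [0] * 9 + [1]

balanced_texts, balanced_labels = random_oversample(toy_texts, toy_labels)
print(Counter(balanced_labels))

# A classifier that always predicts the majority class.
always_negative = [0] * 10
print(accuracy(toy_labels, always_negative))
print(macro_f1(toy_labels, always_negative))
```

On this toy data the always-negative classifier scores 0.9 on Accuracy while its Macro-F1 stays below 0.5, which shows why Macro-F1 is a useful second metric when positive and negative samples are imbalanced.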

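Keywords: high-value invention patents; patent identification; patent text; patent value assessment; contrastive learning; wireless communication network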

Classification code: G306 [Culture and Science]

 
