Research on a Text Matching Model Based on Copyright Authentication

Authors: LIU Xiaofei; MO Xiuliang [1] (Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China)

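Affiliation: [1] Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384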

Source: Journal of Tianjin University of Technology, 2025, No. 1, pp. 90-96 (7 pages)

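Funding: National Natural Science Foundation of China (General Program, Joint Fund) (U1536122); Ministry of Science and Technology key special project "Science and Technology Boosting the Economy 2020" (SQ2020YFF0413781); Major Special Project of the Tianjin Science and Technology Commission (15ZXDSGX00030); Key Project of the Scientific Research Program of the Tianjin Education Commission (2018ZD11).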

Abstract: As the number of digital works online grows and public awareness of copyright rises, confirming the copyright ownership of digital works has become very important. Text matching technology, which uses algorithms to judge whether sentences are semantically similar, is well suited to detecting the originality of digital works. With the rapid development of deep learning in recent years, methods for text matching tasks have also advanced considerably. Building on the existing kernel-based neural model for document ranking (KNRM), this paper proposes a text matching model that combines KNRM with the light gradient boosting machine (LightGBM) algorithm. Kernel pooling is applied to the histogram derived from the interaction matrix to extract relevant local feature information; K kernel functions of different sizes are introduced to capture matching signals at different levels of granularity and to obtain Gaussian kernel features. LightGBM then serves as the classifier that predicts the final matching result. The model is validated on multiple data sets, and the experiments show that the fused model KNRM-LightGBM outperforms the original KNRM in accuracy and achieves better text matching results.
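The kernel-pooling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-dimensional embeddings, the kernel means in `mus`, the shared width `sigma`, and the clamp value `1e-10` are all assumptions chosen for illustration. The sketch builds a cosine-similarity interaction matrix between query and document word vectors, applies K Gaussian kernels to each row to get soft match counts, and sums the log counts over query words to produce one feature per kernel.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def kernel_pooling(query_vecs, doc_vecs, mus, sigma=0.1):
    """Return one Gaussian-kernel feature per kernel mean in `mus`.

    `mus` and `sigma` are illustrative assumptions, not values from the paper.
    """
    # Interaction matrix: one row per query word, one column per document word.
    sim = [[cosine(q, d) for d in doc_vecs] for q in query_vecs]
    features = []
    for mu in mus:
        total = 0.0
        for row in sim:
            # Soft count of document words whose similarity is close to mu.
            soft_count = sum(
                math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in row
            )
            # Clamp before the log so an empty kernel does not give -inf.
            total += math.log(max(soft_count, 1e-10))
        features.append(total)
    return features


# Toy word embeddings (hypothetical data).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
mus = [1.0, 0.5, 0.0, -0.5]

features = kernel_pooling(query, doc, mus)
print(features)
```

In a fused setup like the one the abstract outlines, a feature vector of this kind, computed for each text pair, would be the input to a gradient-boosting classifier such as `lightgbm.LGBMClassifier`, which predicts whether the pair matches. How the paper wires the two parts together in detail is not stated in the abstract.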

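Keywords: text matching; kernel-based neural model for document ranking (KNRM); light gradient boosting machine (LightGBM); digital copyright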

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
