Synonym Substitution Text Watermarking Algorithm Based on Invariant Features


Authors: FANG Jing [1]; JIN Biao [1]; SONG Kao [1]; YAN Xishan; XIONG Jinbo [1,2]

Affiliations: [1] College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, Fujian, China; [2] Fujian Provincial Key Laboratory of Network Security and Cryptology, Fuzhou 350117, Fujian, China

Source: Journal of Fujian Normal University (Natural Science Edition), 2025, No. 3, pp. 9-18 (10 pages)

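Funding: National Natural Science Foundation of China (62272102); Key Program of the Natural Science Foundation of Fujian Province (2023J02014); Natural Science Foundation of Fujian Province (2023J01531).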

Abstract: To address the insufficient concealment and poor robustness of existing text watermarking methods, this paper proposes two robust text watermarking algorithms based on invariant features: OurSyntactic and OurKeyword. First, the syntactic parser SuPar identifies the central word of each sentence, and the KeyBert algorithm extracts the sentence's keywords; both serve as invariant features. Second, words adjacent to the invariant features are selected for substitution. The BERT model generates a set of replacement candidates, which is then filtered by evaluating the semantic similarity between words. Finally, a hash-based watermark encoding function is designed for watermark extraction and verification. Compared with existing schemes, the proposed algorithms can embed a specified watermark code while showing greater robustness against text attacks such as word substitution, insertion, and deletion. Specifically, the bit error rates of OurSyntactic and OurKeyword are 12% and 9% lower, respectively, than those of the baseline experiments.
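The abstract does not give the encoding function itself, so the following is only a minimal sketch of how a hash-based bit encoding around an invariant word could work in general. All names here (`hash_bit`, `embed_bit`, `extract_bit`, `bit_error_rate`, the `key` parameter, and the example words) are hypothetical and are not taken from the paper. The candidate list stands in for the filtered BERT output, and the `anchor` argument stands in for a central word or keyword found by SuPar or KeyBert.

```python
import hashlib


def hash_bit(anchor: str, word: str, key: str = "demo-key") -> int:
    """Map an (anchor, word) pair to one bit with a keyed SHA-256 digest.

    Assumption: the bit is the lowest bit of the first digest byte.
    """
    message = f"{key}|{anchor.lower()}|{word.lower()}".encode("utf-8")
    return hashlib.sha256(message).digest()[0] & 1


def embed_bit(anchor, original, candidates, bit, key="demo-key"):
    """Pick a word next to the anchor whose hash bit equals `bit`.

    The original word is tried first so the text changes only when needed.
    Returns None when no word encodes the bit (the caller may skip the slot).
    """
    for word in [original, *candidates]:
        if hash_bit(anchor, word, key) == bit:
            return word
    return None


def extract_bit(anchor, word, key="demo-key"):
    """Recover the bit from the anchor and the word found beside it."""
    return hash_bit(anchor, word, key)


def bit_error_rate(sent, received):
    """Fraction of positions where the extracted bits differ from the sent bits."""
    if len(sent) != len(received):
        raise ValueError("bit sequences must have the same length")
    if not sent:
        return 0.0
    return sum(a != b for a, b in zip(sent, received)) / len(sent)


if __name__ == "__main__":
    # Hypothetical anchor (invariant feature) and synonym candidates.
    for target in (0, 1):
        chosen = embed_bit("algorithm", "robust", ["sturdy", "resilient", "strong"], target)
        if chosen is not None:
            print(target, chosen, extract_bit("algorithm", chosen))
```

Because extraction only needs the anchor word and its neighbor, edits elsewhere in the sentence leave the recovered bit unchanged as long as the anchor is still identified, which is the intuition behind using invariant features.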

Keywords: language model; invariant features; lexical substitution; text watermarking; robustness

Classification code: TP309.2 [Automation and Computer Technology: Computer System Architecture]

 
