User plagiarism identification scheme in social network under blockchain

Authors: LI Li (李莉); YANG Chunyan (杨春艳); ZHU Jiangwen (朱江文); HU Ronglei (胡荣磊)

Affiliations: [1] College of Electronic and Communication Engineering, Beijing Electronic Science and Technology Institute, Beijing 100071, China; [2] College of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 1, pp. 242-251 (10 pages)

Funding: Fundamental Research Funds for the Central Universities (3282023017).

Abstract: To address the difficulty of identifying user plagiarism in social networks, and to protect the rights of original authors while holding plagiarizing users accountable, a plagiarism identification scheme for social network users under blockchain is proposed. Because existing blockchains lack a universal traceability model, a blockchain-based traceability information management model is designed to record user operation information and provide a basis for text similarity detection. Building on the Merkle tree and Bloom filter structures, a new index structure, BHMerkle, is designed; it reduces the computational overhead of block construction and query and enables fast location of transactions. In addition, a multi-feature weighted Simhash algorithm is proposed to improve the accuracy of word weight calculation and the efficiency of the signature matching stage. In this way, malicious users who plagiarize can be identified, and malicious behavior can be curbed through a reward and punishment mechanism. On news datasets covering different topics, the scheme achieves an average precision of 94.8% and an average recall of 88.3%. Compared with the multi-dimensional Simhash algorithm and the Simhash algorithm based on information entropy weighting (E-Simhash), average precision increases by 6.19 and 4.01 percentage points respectively, and average recall increases by 3.12 and 2.92 percentage points respectively. Experimental results show that the proposed scheme improves the query and detection efficiency for plagiarized text and achieves high accuracy in plagiarism identification.
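
Illustrative sketch: the abstract does not give the details of the multi-feature weighting, so the following is only a minimal example of plain weighted Simhash, the baseline technique the scheme builds on. Term frequency stands in for the paper's word weights, and the helper names (`token_hash`, `simhash`, `hamming`, `term_frequency`) and the 64-bit fingerprint width are assumptions made for illustration, not the authors' implementation.

```python
import hashlib
from collections import Counter


def token_hash(token: str, bits: int = 64) -> int:
    """Deterministic hash of a token, truncated to `bits` bits.

    MD5 is used here only because it is stable across runs (assumption;
    the paper does not say which hash function it uses).
    """
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


def term_frequency(text: str) -> dict[str, float]:
    """Stand-in weighting: raw term frequency of whitespace-split tokens."""
    return dict(Counter(text.lower().split()))


def simhash(weights: dict[str, float], bits: int = 64) -> int:
    """Weighted Simhash: each token votes +w or -w on every bit position."""
    votes = [0.0] * bits
    for token, w in weights.items():
        h = token_hash(token, bits)
        for i in range(bits):
            votes[i] += w if (h >> i) & 1 else -w
    # A bit of the fingerprint is 1 where the accumulated vote is positive.
    fingerprint = 0
    for i, v in enumerate(votes):
        if v > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


original = simhash(term_frequency("blockchain records user operations for tracing"))
candidate = simhash(term_frequency("blockchain records user operations for tracing"))
print(hamming(original, candidate))  # identical texts give distance 0
```

Two texts are treated as near-duplicates when the Hamming distance between their fingerprints falls below a chosen threshold; the threshold and the weighting features are the parts a scheme such as the one described here would tune.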

Keywords: blockchain; plagiarism identification; Simhash algorithm; similarity detection; social network

Classification No.: TP309.2 [Automation and Computer Technology: Computer System Architecture]

 
