Weibo Authorship Identification Based on FastText-Transformer

Authors: CAI Manchun (蔡满春) [1]; CHEN Zheng (陈政); HE Quan (何泉)

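Affiliations: [1] School of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China; [2] Suzhou Public Security Bureau, Suzhou, Jiangsu 215011, China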

Source: Journal of People's Public Security University of China (Science and Technology), 2025, No. 1, pp. 54-59 (6 pages)

Funding: 2022 Basic Scientific Research Funds Project of People's Public Security University of China (2022JKF02009).

Abstract: With the rapid growth of online text and the spread of social media, demand for identifying the authors of texts keeps rising, with significant implications for source tracing, network security, and social governance. Authorship identification for Chinese online short texts remains challenging, however, because self-media content is vast and semantically flexible. To automate feature extraction and raise identification accuracy, the authors improve the FastText model within a deep learning framework to obtain higher-quality word vectors, then feed the word vectors output by FastText into an improved Transformer Encoder model to improve classification quality. Experimental results show that the proposed model reaches 92.3% accuracy in identifying the authors of texts in a Weibo dataset, which indicates that it can perform Weibo authorship identification.
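The pipeline described in the abstract (subword-aware word vectors passed to a Transformer encoder) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how FastText or the Transformer Encoder was improved, so the code below shows only the standard ideas, namely hashed character n-gram vectors in the FastText style, followed by one single-head scaled dot-product self-attention step with identity projections and mean pooling. The names `DIM`, `BUCKETS`, `char_ngrams`, `word_vector`, `self_attention`, and `encode` are hypothetical, the embedding table is random rather than trained, and the input is assumed to be already segmented into words.

```python
import math
import random
import zlib

DIM = 8          # embedding width (hypothetical, kept tiny for readability)
BUCKETS = 1000   # number of hashed n-gram slots (hypothetical)

# Untrained, randomly initialised n-gram table; a real model would learn it.
_rng = random.Random(0)
TABLE = [[_rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(BUCKETS)]


def char_ngrams(word, n_min=2, n_max=3):
    """Return character n-grams of the word wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def word_vector(word):
    """Average the hashed n-gram rows to get one vector per word."""
    rows = [TABLE[zlib.crc32(g.encode("utf-8")) % BUCKETS]
            for g in char_ngrams(word)]
    return [sum(col) / len(rows) for col in zip(*rows)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product attention with identity Q/K/V (assumption)."""
    scale = math.sqrt(len(vectors[0]))
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, vectors))
                    for d in range(len(q))])
    return out


def encode(tokens):
    """Mean-pool the attended token vectors into one fixed-size text vector."""
    attended = self_attention([word_vector(t) for t in tokens])
    return [sum(col) / len(attended) for col in zip(*attended)]


# Example: a pre-segmented short post becomes a DIM-sized feature vector.
features = encode(["今天", "天气", "不错"])
```

In a full system the resulting feature vector would feed a classifier over the candidate authors, and the table, attention projections, and classifier would be trained jointly; none of that is shown here.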

Keywords: authorship identification; FastText model; Transformer model

Classification code: D035.39 [Politics and Law: Political Science]

 
