Authors: 郭群 (GUO Qun); 张华熊 (ZHANG Huaxiong)[1]; 王波 (WANG Bo); 王心怡 (WANG Xinyi)
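Affiliations: [1] School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018, China; [2] Hangzhou DtDream Technologies Co., Ltd., Hangzhou, Zhejiang 310013, China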
Source: Software Engineering (《软件工程》), 2025, No. 2, pp. 6-9, 26 (5 pages in total)
Funding: "Pioneer" and "Leading Goose" R&D Program of the Science and Technology Department of Zhejiang Province (2024C01019, 2022C01220).
Abstract: Existing methods cannot effectively recognize structurally complex sensitive personal information entities in unstructured text. To address this problem, the paper proposes a sensitive personal information entity recognition method based on content and context. On the one hand, rule matching detects sensitive entity types with predictable patterns; on the other hand, an entity relation classification model built on a word-pair relation classification architecture (ELECTRA-W2NER, EW2NER) detects sensitive entity types with complex patterns. EW2NER uses the ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately) model for word embeddings and adopts an entity relation classification architecture to extract flat and overlapping sensitive personal information entities in a unified way. The model achieves an F1 score of 97.05% on a Chinese sensitive-data set, outperforming the ExSense (Extract Sensitive Information from Unstructured Data) model.
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
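
The abstract mentions a rule-matching branch for sensitive entity types with predictable patterns but does not list the rules themselves. The following is only a minimal sketch of what such a branch might look like; the entity labels, the regular expressions, and the function name `find_sensitive_entities` are illustrative assumptions, not the rules used in the paper.

```python
import re

# Hypothetical rule set (an assumption for illustration, not taken from the paper).
# Each label maps to a regex for an entity type with a predictable surface pattern.
PATTERNS = {
    # Simplified email pattern; real-world validation is stricter.
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 11-digit mainland China mobile number, not embedded in a longer digit run.
    "CN_MOBILE": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    # 18-character resident ID number (format check only, no checksum validation).
    "CN_ID_CARD": re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),
}


def find_sensitive_entities(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), label, match.group()))
    return sorted(hits)


if __name__ == "__main__":
    sample = "电话13812345678,邮箱 test@example.com"
    for start, end, label, value in find_sensitive_entities(sample):
        print(label, start, end, value)
```

In a pipeline like the one the abstract outlines, spans found this way would presumably be combined with the entities predicted by the learned model for types whose patterns are too complex for rules; how the two outputs are merged is not stated in the record.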