Authors: MOU Shao-xia (牟少霞); LV Bing-cai (吕冰彩)
Affiliations: [1] University of Perpetual Help System Dalta, Graduate School Eternal University, Manila 6015, Philippines; [2] Shandong Education Enrollment Examination Institute, Jinan, Shandong 250011, China
Source: Computing Technology and Automation (《计算技术与自动化》), 2023, No. 3, pp. 85-89, 95 (6 pages)
Abstract: To improve sensitive data extraction, this paper proposes a semi-supervised method for extracting sensitive data from human-computer interaction information that incorporates attention mechanisms. A bidirectional long short-term memory model with a conditional random field (Bi-LSTM-CRF) is built by combining a convolution-like attention mechanism with a human-computer interaction attention mechanism. The model first converts sensitive words into a character matrix, which a Bi-LSTM encodes to obtain a distributed representation of each word's character-level features. A second Bi-LSTM pass over this representation yields hidden states that carry the contextual information of the sensitive words. Based on these hidden states, the convolution-like attention layer distributes attention weights over nearby words and the interaction attention layer distributes them over all words, producing a convolution-like attention matrix and an interaction attention matrix. The two matrices are concatenated into a two-layer attention matrix, which a gated recurrent unit in the interaction attention layer updates into a new attention matrix. A fully connected layer then reduces its dimensionality to produce the predicted label for each sensitive word, completing the semi-supervised extraction. Experimental results show that the method effectively reduces the complexity of sensitive data extraction and achieves a high recall rate.
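Keywords: attention mechanism; human-computer interaction; semi-supervised; sensitive data extraction; BiLSTM model; CRF model
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]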
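The abstract's two attention layers and their concatenation can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the `radius` window parameter, and the function names are all assumptions made for illustration, and the Bi-LSTM encoders, the gated recurrent unit update, and the CRF output are omitted. Given one hidden-state vector per word, it computes a context from nearby words only (the convolution-like layer) and a context from all words (the interaction layer), then concatenates the two.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(hidden, i, positions):
    """Attention-weighted sum of hidden[j] for j in positions.

    Scores are dot products against hidden[i] (an assumed scoring function).
    """
    weights = softmax([dot(hidden[i], hidden[j]) for j in positions])
    dim = len(hidden[i])
    return [
        sum(w * hidden[j][d] for w, j in zip(weights, positions))
        for d in range(dim)
    ]


def two_layer_attention(hidden, radius=1):
    """Concatenate a local-window context and a global context per word.

    `radius` (hypothetical parameter) sets how many neighbours on each side
    the convolution-like layer may attend to.
    """
    n = len(hidden)
    rows = []
    for i in range(n):
        window = list(range(max(0, i - radius), min(n, i + radius + 1)))
        local_ctx = attend(hidden, i, window)            # nearby words only
        global_ctx = attend(hidden, i, list(range(n)))   # all words
        rows.append(local_ctx + global_ctx)              # list concatenation
    return rows


if __name__ == "__main__":
    # Toy hidden states standing in for the second Bi-LSTM's output.
    hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in two_layer_attention(hidden, radius=1):
        print([round(x, 3) for x in row])
```

Each output row has twice the hidden dimension. In the pipeline the abstract describes, a matrix of this shape would then be updated by the gated recurrent unit and reduced by the fully connected layer.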