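Authors: CHAI Yue; ZHAO Tongzhou[1]; JIANG Yiqi; GAO Peidong (School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China)
Affiliation: [1] School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, Hubei, China
Source: Journal of Wuhan Institute of Technology (《武汉工程大学学报》), 2020, No. 5, pp. 575-580 (6 pages)
Funding: National Natural Science Foundation of China (61601176); Open Project of the Wuhan Research Institute (IWHS20192031).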
Abstract: LSTM-based topic word extraction suffers from low accuracy because the network does not account for the influence of the text that follows the central word. To address this, the paper proposes a topic word extraction method based on a bidirectional LSTM with an attention mechanism (Att-iBi-LSTM). First, LSTM is used to model the preceding and following context of the central word in two directions. Then an attention mechanism is introduced into the bidirectional LSTM to assign higher weights to the more influential features. Finally, a softmax layer classifies each word in the document as a topic word or a non-topic word. A two-stage training method is also proposed: the model is pre-trained on an automatically labeled training set and then trained on a manually labeled dataset. Topic word extraction experiments were carried out on three types of news text: sports, entertainment, and science and technology. The results show that Att-iBi-LSTM improves the F1 value by 13.78%, 24.31%, and 3.32% over SVM, TextRank, and LSTM, respectively, and that Att-iBi-LSTM with two-stage training achieves an F1 value 1.56% higher than with one-stage training.
Keywords: LSTM; attention mechanism; topic word extraction
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
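The attention and softmax steps described in the abstract can be illustrated with a small standard-library sketch. This is not the authors' implementation: the abstract does not give the scoring function, so the additive form used here (a hypothetical scoring vector `w` applied to `tanh` of each hidden state), the toy hidden states standing in for bidirectional LSTM outputs, and the names `attention_pool` and `classify` are all assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, w):
    """Weight each hidden state by an attention score and sum them.

    hidden_states: list of vectors, one per context position (assumed to be
                   the concatenated forward and backward LSTM outputs).
    w:             hypothetical scoring vector; score_t = w . tanh(h_t).
    """
    scores = [sum(wi * math.tanh(hi) for wi, hi in zip(w, h))
              for h in hidden_states]
    alphas = softmax(scores)  # attention weights, sum to 1
    dim = len(hidden_states[0])
    context = [sum(a * h[d] for a, h in zip(alphas, hidden_states))
               for d in range(dim)]
    return alphas, context

def classify(context, weights, bias):
    """Two-class softmax layer: [P(topic word), P(non-topic word)]."""
    logits = [sum(wi * ci for wi, ci in zip(row, context)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

# Toy values only; a real model would learn w, weights, and bias.
hidden_states = [
    [0.1, -0.2, 0.0, 0.3],
    [0.9, 0.8, 0.7, 0.6],
    [-0.5, 0.1, 0.2, -0.1],
]
w = [1.0, 1.0, 1.0, 1.0]
alphas, context = attention_pool(hidden_states, w)
probs = classify(context,
                 [[0.5, 0.5, 0.5, 0.5], [-0.5, -0.5, -0.5, -0.5]],
                 [0.0, 0.0])
```

With these toy values the second hidden state has the largest score, so it receives the largest attention weight and dominates the pooled context vector that is passed to the classifier.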