Authors: CAO Chun-ping, WU Ting (School of Optical-electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200082, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2019, No. 11, pp. 1-6.
Funds: National Natural Science Foundation of China (61402288); Natural Science Foundation of Shanghai (15ZR1429100).
Abstract: The Internet now hosts many review-style articles. These articles are long and contain much information irrelevant to their topics, which degrades the performance of downstream text-analysis tasks. Traditional solutions cannot model multi-topic long texts, and existing neural networks cannot capture semantic associations across relatively long time steps. We therefore propose a deep network model that combines a single-layer neural network with hierarchical long short-term memory (LSTM) networks, and apply it to the task of long-text filtering. The model uses a word-level LSTM to capture the relationships among words within a sentence and produce semantically meaningful sentence vectors. These sentence vectors are then fed into a topic-dependence calculation model and a sentence-level LSTM, yielding each sentence's dependence on every topic category as well as the associations between a candidate sentence to be filtered and the other sentences. Experiments on a travel-note dataset collected from Mafengwo show that the model outperforms SVM, naive Bayes, LSTM, and Bi-LSTM.
Keywords: long text filtering; multi-topic; semantic association; LSTM; hierarchical model
CLC Number: TP31 [Automation and Computer Technology — Computer Software and Theory]
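The hierarchical architecture described in the abstract (a word-level LSTM that turns each sentence into a vector, followed by a topic-dependence score and a sentence-level LSTM over the sentence vectors) can be sketched as follows. This is a minimal illustrative sketch in plain Python, not the authors' implementation: the random weights, the dimensions, the cosine-similarity dependence score, and the filtering threshold are all assumptions for demonstration.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyLSTM:
    """A minimal LSTM cell over vectors represented as plain lists (random weights)."""
    def __init__(self, input_size, hidden_size):
        self.hidden_size = hidden_size
        def mat(rows, cols):
            return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
        # one weight matrix and bias per gate: input, forget, candidate, output
        self.W = {g: mat(hidden_size, input_size + hidden_size) for g in "ifco"}
        self.b = {g: [0.0] * hidden_size for g in "ifco"}

    def step(self, x, h, c):
        z = x + h  # concatenate input vector and previous hidden state
        def gate(g, act):
            return [act(sum(w * v for w, v in zip(row, z)) + b)
                    for row, b in zip(self.W[g], self.b[g])]
        i, f = gate("i", sigmoid), gate("f", sigmoid)
        g, o = gate("c", math.tanh), gate("o", sigmoid)
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def encode(self, seq):
        """Run over a sequence of vectors; return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in seq:
            h, c = self.step(x, h, c)
        return h

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical dimensions and data: 3 sentences of 5 words, 2 topic vectors.
EMB, HID = 8, 6
word_lstm = TinyLSTM(EMB, HID)   # word-level LSTM -> one vector per sentence
sent_lstm = TinyLSTM(HID, HID)   # sentence-level LSTM over the sentence vectors
doc = [[[random.gauss(0, 1) for _ in range(EMB)] for _ in range(5)] for _ in range(3)]
topics = [[random.gauss(0, 1) for _ in range(HID)] for _ in range(2)]

sent_vecs = [word_lstm.encode(sentence) for sentence in doc]
doc_vec = sent_lstm.encode(sent_vecs)  # document-level context across sentences
# topic dependence of each sentence: cosine similarity to each topic vector
scores = [[cosine(s, t) for t in topics] for s in sent_vecs]
# filtering step (hypothetical rule): keep sentences whose best topic score is positive
kept = [idx for idx, row in enumerate(scores) if max(row) > 0.0]
```

In the paper's terms, `sent_vecs` plays the role of the semantic sentence vectors from the word-level LSTM, `scores` the topic-dependence values, and `doc_vec` the inter-sentence association captured by the sentence-level LSTM; a real implementation would learn the weights and the filtering decision rather than use random parameters and a fixed threshold.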