An Oversampling Method for Online Reviews Based on Fusion Text Sentiment Transfer


Authors: ZHAO Chang-xin; CHANG Dao-fang [1]

Affiliation: [1] School of Logistics Engineering, Shanghai Maritime University, Shanghai 201306, China

Source: Computer Technology and Development, 2024, No. 12, pp. 108-115 (8 pages)

Abstract: In recent years, online review sections have been distorted by the abuse of review-manipulation tactics such as cashback for positive reviews and fake reviews, which seriously degrades the performance of online review sentiment analysis models in real application scenarios. To mitigate the problems caused by the resulting imbalance in the sample distribution, this paper proposes an oversampling method for online reviews based on fusion text sentiment transfer. The method combines a feature dictionary-based approach with a deep learning-based approach to transfer the sentiment of text. Explicit sentiment expressions in majority-class samples are identified and replaced using the feature dictionary. Implicit sentiment expressions, which the dictionary cannot match, are replaced by a Seq2Seq model equipped with a masked self-attention mechanism. Finally, a restricted EDA method further expands the results, which serve as the oversampled minority-class samples. Experiments on a collected dataset of real online reviews show that models trained with the method gain 16.6% in precision and 9.5% in F1 score, and that their ability to distinguish minority-class samples improves by 12.2%. The method also yields larger performance gains for the trained models than traditional methods.

Keywords: text sentiment transfer; imbalanced online reviews; feature dictionary; masked self-attention; Seq2Seq model

Classification code: TP391 [Automation and Computer Technology - Computer Application Technology]
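
The dictionary-based replacement and the restricted augmentation step summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The lexicon `POS_TO_NEG` and the function names are hypothetical, the abstract does not say how EDA is restricted (the sketch assumes sentiment words are simply protected from random swaps), and the Seq2Seq stage for implicit sentiment is omitted entirely.

```python
import random

# Hypothetical toy lexicon mapping positive expressions to negative ones.
# The paper's actual feature dictionary is not given in the abstract.
POS_TO_NEG = {
    "good": "bad",
    "great": "terrible",
    "fast": "slow",
    "love": "hate",
}


def flip_explicit_sentiment(tokens, lexicon=POS_TO_NEG):
    """Replace every token found in the lexicon with its opposite-polarity entry."""
    return [lexicon.get(tok.lower(), tok) for tok in tokens]


def restricted_random_swap(tokens, protected, n_swaps=1, rng=None):
    """EDA-style random swap that never moves protected (sentiment) tokens.

    Assumption: "restricted" means sentiment-bearing words keep their position.
    """
    if rng is None:
        rng = random.Random()
    out = list(tokens)
    # Indices of tokens that are allowed to move.
    free = [i for i, tok in enumerate(out) if tok.lower() not in protected]
    if len(free) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(free, 2)
        out[i], out[j] = out[j], out[i]
    return out


def oversample(majority_review, n_variants=3, seed=0):
    """Turn one majority-class (positive) review into minority-class variants."""
    rng = random.Random(seed)
    flipped = flip_explicit_sentiment(majority_review.split())
    protected = set(POS_TO_NEG.values())
    variants = [" ".join(flipped)]
    for _ in range(n_variants):
        swapped = restricted_random_swap(flipped, protected, rng=rng)
        variants.append(" ".join(swapped))
    return variants
```

Calling `oversample("The delivery was fast and the quality is great")` returns the sentiment-flipped review followed by shuffled variants in which the replaced sentiment words stay in place. A real pipeline would need proper word segmentation for Chinese reviews, which whitespace splitting does not provide.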

 
