Analysis of Several Influencing Factors on Word2Vec Text Classification Effect (关于Word2Vec文本分类效果若干影响因素的分析)

Cited by: 4


Author: XIE Qingheng (谢庆恒), National Library of China, Beijing 100081, China

Affiliation: [1] National Library of China, Beijing 100081

Source: Modern Information Technology (《现代信息科技》), 2024, No. 1, pp. 125-129 (5 pages)

Abstract: The Word2Vec vector model has many parameters, and its classification performance varies across scenarios, so its influencing factors need to be analyzed. Starting from the basic principles of the Word2Vec model, this paper analyzes and discusses how three groups of factors affect classification performance: the pre-training corpus, the word-vector pre-training parameters, and the classification model parameters. The results indicate that a domain-restricted corpus performs better than a general-domain corpus. Among the pre-training parameters, a larger vector dimension gives better results, the window size has an optimal value, and the classification algorithm has little impact. Among the classification model parameters, the learning rate, activation function, and batch size have a relatively large impact on classification performance, while the number of training epochs has a relatively small one.
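The three groups of factors named in the abstract can be pictured as a small experiment grid. The sketch below is only an illustration, not the paper's actual setup: the abstract does not report which values were tried, so every value, the key names (`vector_size` and `window` are borrowed from common Word2Vec tooling), and the helper functions `expand` and `one_factor_sweeps` are assumptions.

```python
from itertools import product

# Hypothetical search space: the abstract names the factors, not the values tried.
PRETRAIN_GRID = {
    "corpus": ["domain_specific", "general"],  # restricted-domain vs. general-domain corpus
    "vector_size": [50, 100, 200, 300],        # word-vector dimension
    "window": [3, 5, 8],                       # context window size
}
CLASSIFIER_GRID = {
    "learning_rate": [1e-2, 1e-3],
    "activation": ["relu", "tanh"],
    "batch_size": [32, 128],
    "epochs": [5, 20],
}


def expand(grid):
    """Return every combination of a {name: [values]} grid as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


def one_factor_sweeps(grid, baseline):
    """Vary one factor at a time around a baseline, keeping the others fixed."""
    runs = []
    for name, values in grid.items():
        for value in values:
            runs.append({**baseline, name: value})
    return runs


pretrain_configs = expand(PRETRAIN_GRID)
# Hypothetical baseline for the classifier-side sweeps.
baseline = {"learning_rate": 1e-3, "activation": "relu", "batch_size": 32, "epochs": 5}
classifier_runs = one_factor_sweeps(CLASSIFIER_GRID, baseline)
print(len(pretrain_configs), len(classifier_runs))
```

Sweeping one classifier parameter at a time around a fixed baseline keeps the number of runs small and makes each factor's effect easier to attribute, whereas the full Cartesian grid is used for the pre-training side, where the abstract reports an optimum in window size that could depend on the other settings.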

Keywords: Word2Vec; text classification; model effect; influencing factors

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
