Author: XIE Qingheng (谢庆恒), National Library of China, Beijing 100081, China
Affiliation: [1] National Library of China, Beijing 100081
Source: Modern Information Technology (《现代信息科技》), 2024, No. 1, pp. 125-129 (5 pages)
Abstract: The Word2Vec word-vector model has many parameters, and its classification performance varies across scenarios, so analyzing the factors that influence it is worthwhile. Starting from the basic principles of the Word2Vec model, this paper analyzes how three factors affect classification performance: the pre-training corpus, the word-vector pre-training parameters, and the classification model parameters. The results indicate that a domain-specific corpus performs better than a general-domain corpus. Among the pre-training parameters, a larger vector dimension gives better results, the window size has an optimal value, and the classification algorithm has little effect. Among the classification model parameters, the learning rate, activation function, and batch size have a large effect on classification performance, while the number of training epochs has a relatively small effect.
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]