Research on a Computer System Method for Japanese Literature Corpus Processing Based on an Improved TF-IDF Algorithm


Authors: WEI Haiyan (魏海燕)[1]; SHEN Jin (沈进)[1]

Affiliation: [1] Xi’an FANYI University, Xi’an 710105, China

Source: Automation & Instrumentation (《自动化与仪器仪表》), 2023, No. 1, pp. 162-165 (4 pages)

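Funding: Shaanxi Federation of Social Sciences, 2022 Key Research Project on International Communication Capacity Building, "Research on the Image of China in Japanese Literature and the Cultivation of Students' Cultural Confidence" (2022HZ0857); Xi’an FANYI University practice project "Japanese Translation Workshop" (SJ19A03).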

Abstract: Corpus processing of Japanese literature helps extract valuable textual information quickly, which makes the texts easier to read and understand. To this end, a Japanese literature corpus processing model is built on deep learning algorithms. First, an improved TF-IDF algorithm is used to classify sentiment corpora. Second, a convolutional neural network (CNN) is combined with a self-looping idea to build a self-looping CNN model that handles the classification of corpora of unequal length. Finally, a CNN is combined with a bidirectional gated recurrent unit (BiGRU) to handle the classification of topic-specific corpora. These three components together form the Japanese literature corpus processing model. Results from repeated experiments show that the model's classification accuracy exceeds 90%, indicating that it can process Japanese literature corpora effectively.
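The abstract does not say how the TF-IDF algorithm was improved, so the following is only a minimal sketch of the standard baseline weighting, tf(t, d) × log(N / df(t)), that such an improvement would start from. The function name `tf_idf` and the toy corpus are hypothetical, and the documents are assumed to be tokenized already (Japanese text would first need a morphological analyzer).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Baseline TF-IDF: return one {term: weight} dict per tokenized document.

    This is the textbook formulation, not the improved variant from the paper.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Relative term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, pre-tokenized toy corpus (illustration only).
docs = [
    ["雪", "国", "雪"],
    ["雪", "夜"],
    ["心", "夜", "心"],
]
weights = tf_idf(docs)
```

A term that occurs in every document receives a weight of zero under this formulation, which is one reason weighted or smoothed variants are commonly used for classification features.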

Keywords: TF-IDF algorithm; convolutional neural network; corpus processing; Japanese literature

Classification code: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
