Research on Word Memory Retrieval Difficulty Prediction Based on DBSCAN and Random Forest


Authors: FU Xiaoliang, LUO Zhengjun[1], YANG Yihao, ZHENG Zhuqian (School of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 210000, China)

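Affiliation: [1] School of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 210000, Jiangsu, China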

Source: Modern Electronics Technique, 2023, No. 21, pp. 105-110 (6 pages)

Funding: Research on the Mechanism and Policy of Renewable Energy Power Consumption Based on Data-Driven Governance (ND2021002).

Abstract: Word memory retrieval, the process of finding a word in memory, is an important aspect of word learning, yet research on it remains very limited. This paper proposes a model for predicting word memory retrieval difficulty based on DBSCAN clustering and random forests. Using a simulation program and machine learning algorithms, the model predicts retrieval difficulty from the characteristics of the word itself. First, a word memory retrieval simulation program is developed. Feature vectors are then constructed from the simulation results together with each word's letter composition structure, part of speech, and usage frequency, and a set of random forest regression models is trained to predict a seven-dimensional vector representation of retrieval difficulty. In addition, the DBSCAN clustering algorithm is used to obtain difficulty labels for the words, and a random forest classification model is trained on these labels to predict each word's difficulty class. Experimental results show that the regression model set achieves a mean goodness of fit (R²) of 0.906, the classification model achieves an accuracy of 0.985, and the model as a whole shows good robustness.
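Two building blocks named in the abstract, DBSCAN labeling and the mean R² over a set of per-dimension regression models, can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy seven-dimensional vectors, the `eps` and `min_samples` values, and all function names are assumptions chosen for illustration, and the random forest training step is omitted (in practice a library such as scikit-learn would normally supply both DBSCAN and the random forest models).

```python
import math

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Plain DBSCAN: return one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def r_squared(y_true, y_pred):
    """Goodness of fit R^2 = 1 - SS_res / SS_tot for one output dimension."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_r_squared(Y_true, Y_pred):
    """Average R^2 across output dimensions (one regression model each)."""
    dims = len(Y_true[0])
    scores = [
        r_squared([row[d] for row in Y_true], [row[d] for row in Y_pred])
        for d in range(dims)
    ]
    return sum(scores) / dims


# Hypothetical 7-D difficulty vectors: three "easy" words, three "hard"
# words, and one isolated word that DBSCAN should flag as noise.
points = [[0.10] * 7, [0.12] * 7, [0.09] * 7,
          [0.90] * 7, [0.88] * 7, [0.91] * 7,
          [0.50] * 7]
labels = dbscan(points, eps=0.1, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this sketch the cluster labels would play the role of the difficulty classes that a classifier is trained on, and `mean_r_squared` corresponds to the averaged goodness of fit reported for the regression model set.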

Keywords: memory retrieval; DBSCAN clustering; feature vector; regression model; classification prediction; goodness of fit; robustness

Classification number: TN911.1-34 [Electronics and Telecommunications: Communication and Information Systems]

 
