A Multi-feature Fusion Algorithm for Label Generation of Educational Resources (多特征融合的教育资源标签生成算法)  Cited by: 1

Authors: LI Wen (李雯); WEN Yong-jun (文勇军)[1,2]; TANG Li-jun (唐立军)[1,2]

Affiliations: [1] School of Physical & Electric Science, Changsha University of Science & Technology, Changsha 410114, Hunan, China; [2] Hunan Province Higher Education Key Laboratory of Modeling and Monitoring on the Near-earth Electromagnetic Environments, Changsha University of Science & Technology, Changsha 410114, Hunan, China

Source: Computer and Modernization (《计算机与现代化》), 2020, No. 9, pp. 19-24 (6 pages)

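Funding: National Science and Technology Support Program of China (2014BAH08F04); Key Research and Development Program of Hunan Province (2018GK2054); Scientific Research Project of the Education Department of Hunan Province (17k004); Hunan Provincial Postgraduate Research and Innovation Project (CX2018B575); Open Fund of the Hunan Province Higher Education Key Laboratory of Modeling and Monitoring on the Near-earth Electromagnetic Environments (N201907).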

Abstract: Describing educational resources with labels (tags) is a simple and effective way to classify the large, disorganized body of educational resources on the Internet, so that users can browse and retrieve educational resource information conveniently and the resources are better utilized. Natural language processing offers many methods for generating text labels, but each describes the features of a text incompletely, which motivates a label generation method based on multi-feature fusion. Drawing on the characteristics of Chinese text, this paper adds TF-IDF weights and position-information weights to the TextRank algorithm, so that both a word's information in the corpus and its position in the article are taken into account. The generated labels therefore reflect corpus information and position information, yielding a multi-feature fusion label generation algorithm. Test results and analysis show that the fused algorithm reaches a maximum F-measure of 0.571 and an average F-measure of 0.34, outperforming the commonly used TextRank and TF-IDF algorithms. It effectively improves the quality of educational resource labels and supports better use and management of educational resources.
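The abstract names the three fused features (TextRank co-occurrence, TF-IDF weight, position weight) and the F-measure used for evaluation, but it does not give the fusion formula. The following Python sketch is one plausible reading, not the authors' implementation. Everything specific in it is an assumption made for illustration: the smoothed IDF, the `title_boost` position weight (words in the title count double), the use of the fused weight as the restart distribution of a biased TextRank iteration, and all function and parameter names. Input is assumed to be already segmented into tokens (for Chinese text, by a word segmenter).

```python
import math
from collections import defaultdict


def tf_idf(doc, corpus):
    """Smoothed TF-IDF for each distinct token of `doc` against `corpus`."""
    n_docs = len(corpus)
    scores = {}
    for word in dict.fromkeys(doc):  # distinct tokens, first-seen order
        tf = doc.count(word) / len(doc)
        df = sum(1 for other in corpus if word in other)
        # Smoothed IDF (an assumption) keeps every weight strictly positive.
        scores[word] = tf * (math.log((1 + n_docs) / (1 + df)) + 1.0)
    return scores


def cooccurrence_graph(tokens, window=3):
    """Undirected graph linking tokens that share a sliding window of `window` tokens."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != word:
                graph[word][other] += 1.0
                graph[other][word] += 1.0
    return graph


def fused_scores(title, body, corpus, d=0.85, iters=50, title_boost=2.0, window=3):
    """Biased TextRank whose restart distribution is TF-IDF times a position weight."""
    doc = title + body
    graph = cooccurrence_graph(doc, window)
    title_words = set(title)
    # Hypothetical fusion rule: multiply the corpus weight by the position weight.
    prior = {w: s * (title_boost if w in title_words else 1.0)
             for w, s in tf_idf(doc, corpus).items()}
    total = sum(prior.values())
    prior = {w: p / total for w, p in prior.items()}
    out_weight = {w: sum(graph[w].values()) for w in prior}
    score = dict(prior)
    for _ in range(iters):
        # Each word keeps (1 - d) of its prior and receives d of its neighbours' votes.
        score = {
            v: (1 - d) * prior[v]
            + d * sum(graph[u][v] / out_weight[u] * score[u] for u in graph[v])
            for v in prior
        }
    return score


def top_labels(scores, k=3):
    """Return the k highest-scoring words as labels."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def f_measure(predicted, reference):
    """Harmonic mean of precision and recall between two label sets."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy, made-up data; not taken from the paper's test set.
    corpus = [
        ["physics", "course", "mechanics", "video"],
        ["chemistry", "course", "experiment", "video"],
        ["history", "course", "lecture", "notes"],
    ]
    title = ["mechanics", "course"]
    body = ["physics", "course", "mechanics", "video", "mechanics", "exercise"]
    labels = top_labels(fused_scores(title, body, corpus), k=3)
    print(labels, f_measure(labels, ["mechanics", "physics"]))
```

Setting `title_boost=1.0` and replacing the TF-IDF prior with a uniform one reduces the sketch to plain TextRank, which makes it easy to compare the fused and unfused rankings on the same document.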

Keywords: educational resource labels; TextRank algorithm; TF-IDF algorithm; label generation; algorithm improvement

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
