Research of Automatic Extraction of Entities of Data Science Recruitment and Analysis Based on Deep Learning (基于深度学习的数据科学招聘实体自动抽取及分析研究)  Cited by: 15

Authors: Wang Dongbo [1], Hu Haotian, Zhou Xin [2], Zhu Danhao [3]

Affiliations: [1] College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095; [2] Department of Information Management, Nanjing University, Nanjing 210093; [3] Department of Computer Science and Technology, Nanjing University, Nanjing 210093

Source: Library and Information Service (《图书情报工作》), 2018, No. 13, pp. 64-73 (10 pages)

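Funding: Major Program of the National Social Science Fund of China, "Research on the Disciplinary Construction of Information Science and the Future Development Path of Information Work" (Grant No. 17ZDA291); one of the research outputs of the Jiangsu Province Research and Innovation Program for Academic-Degree Postgraduates in Regular Universities, "Citation Content Analysis: Automatic Mining of Citation Semantic Information" (KYZZ16_0033)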

Abstract: [Purpose/significance] Data science is rapidly taking shape as an emerging interdisciplinary field that draws on many domains. Extracting entity knowledge from data science recruitment announcements helps both to understand the development of data science from a market perspective and to improve the content of data science teaching. [Method/process] Drawing on job announcements from major recruitment websites, and on the data collection, annotation, and organization methods of information science, this paper constructs a data science recruitment corpus and extracts the corresponding entities from it for analysis. [Result/conclusion] Using a corpus of 11,000 annotated job announcements, the paper compares the performance of the Bi-LSTM-CRF, CRF, and Bi-LSTM models on the task of extracting data science recruitment entities, determines the final automatic extraction model, designs an automatic extraction platform for data science recruitment entities, and builds a data science recruitment entity network.

Keywords: data science; conditional random field; deep learning; Bi-LSTM-CRF

Classification number: G255.1 [Cultural Science: Library Science]

 
