Review of Term Recognition Studies Based on Deep Learning (基于深度学习的术语识别研究综述)

Cited by: 2

Authors: Ruan Guangce (阮光册) [1]; Zhong Jinghan (钟静涵); Zhang Yidi (张祎笛)

Affiliation: [1] Department of Information Management, East China Normal University, Shanghai 200062, China

Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), 2024, No. 4, pp. 64-75 (12 pages)

Abstract: [Objective] This paper reviews the current state of research on deep learning models for term recognition and the challenges they face. [Coverage] We searched CNKI (中国知网) and the Web of Science with the queries 主题=“术语识别”+“术语抽取” (subject = "term recognition" + "term extraction") and subject=“(extract terms OR term recognition OR technology detection OR relation classification) AND deep learning AND ner”, and selected 73 articles for review. [Methods] We reviewed the general framework of deep learning-based term recognition, the choice of models, the advantages and disadvantages of each model, and future development trends. [Results] Deep learning-based term recognition methods fall into three major categories: methods using a single neural network model, methods using composite neural network models, and methods that combine deep learning models with other techniques. In terms of usage, BiLSTM-CRF and its extensions are the mainstream approach to term recognition; BERT and its optimized variants have been a research hotspot in recent years; in specific domains, multi-task models tend to be used in place of plain neural network models; and the application of transfer learning and active learning has become a new research direction. [Limitations] This review offers only a structured analysis of the models and training results reported in existing studies; it does not compare the training performance of different models on the same dataset, which is left for future work. [Conclusion] Future research on deep learning-based term recognition could go deeper into term annotation schemes, the integration of multidimensional term features, term recognition techniques for small or zero datasets, cross-domain model generalization, the interpretability of results, and the improvement of evaluation methods.

Keywords: term recognition; deep learning; text mining

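Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; G35 [Automation and Computer Technology: Control Science and Engineering]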

 
