Automatic labeling and extraction of terms in natural language processing in acupuncture clinical literature

Cited by: 8


Authors: LIU Hua-yun; HAN Chen-jing; XIONG Jie; LI Hai-yan [2]; LEI Lei [2]; LIU Bao-yan [3]

Affiliations: [1] Graduate School of Tianjin University of TCM, Tianjin 301617, China; [2] Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences; [3] China Academy of Chinese Medical Sciences, Beijing 100700, China

Source: Chinese Acupuncture & Moxibustion (《中国针灸》), 2022, No. 3, pp. 327-331 (5 pages)

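Funding: Beijing Administration of Traditional Chinese Medicine, Beijing Traditional Chinese Medicine Science and Technology Development Fund Project (JJ-2020-86); China Academy of Chinese Medical Sciences Science and Technology Innovation Project (CI2021A00501).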

Abstract: This paper analyzes the particular challenges of term recognition in acupuncture clinical literature and compares the strengths and weaknesses of three named entity recognition (NER) methods currently used in the field of traditional Chinese medicine. It argues that the bidirectional long short-term memory network with a conditional random field layer (BiLSTM-CRF) can incorporate contextual information and complete NER with relatively few feature rules, which makes it suitable for term recognition in acupuncture clinical literature. Building on this model, the paper proposes a term recognition workflow with four main stages: literature preprocessing, sequence labeling, model training, and performance evaluation. The workflow is offered as an approach to structuring the terminology of acupuncture clinical literature.
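As an illustration of the sequence labeling and performance evaluation stages named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes character-level BIO tagging and exact-match entity scoring, which are common choices for Chinese NER but are not specified in this record. The label names `ACUPOINT` and `DISEASE`, the example sentence, and all function names are hypothetical.

```python
from typing import List, Tuple

# (start, end_exclusive, label); the label inventory is a hypothetical example
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Turn character-level term spans into one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover (start, end, label) spans from a BIO tag sequence."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_prf(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over entity spans."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hypothetical sentence: "needling Zusanli to treat stomach pain"
sentence = "针刺足三里治疗胃痛"
gold_spans = [(2, 5, "ACUPOINT"), (7, 9, "DISEASE")]
gold_tags = spans_to_bio(sentence, gold_spans)

# A made-up model output that truncates the second term
pred_tags = ["O", "O", "B-ACUPOINT", "I-ACUPOINT", "I-ACUPOINT",
             "O", "O", "B-DISEASE", "O"]
scores = entity_prf(gold_spans, bio_to_spans(pred_tags))
```

In a setup like this, the per-character tags would be the training targets for a sequence model such as BiLSTM-CRF, and the span-level scores would be computed on held-out annotated sentences.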

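Keywords: acupuncture clinical literature; term recognition; named entity recognition; bidirectional long short-term memory network-conditional random field model (BiLSTM-CRF)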

Classification No.: R245 [Medicine and Health: Acupuncture, Moxibustion and Tuina]

 
