Knowledge Graph Construction of Online Medical Community Q&A Texts

Cited by: 18

Authors: LIAO Kai-ji (廖开际) [1]; HUANG Qiong-ying (黄琼影); XI Yun-jiang (席运江) [1]

Affiliation: [1] School of Business Administration, South China University of Technology, Guangzhou 510641, Guangdong, China

Source: Information Science (《情报科学》), 2021, No. 3, pp. 51-59, 75 (10 pages)

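Funding: National Natural Science Foundation of China project "Research on Supernetwork-Based Knowledge Mining and Integration Methods for Enterprise Microblogs" (71371077).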

Abstract:

[Purpose/significance] Data from medical Q&A communities is large in volume, poorly standardized, and sparse. To address these characteristics, this paper studies entity recognition and relation extraction for community texts, combining deep learning models including the bidirectional long short-term memory network (BiLSTM), the conditional random field (CRF), and the bidirectional gated recurrent unit (BiGRU).

[Method/process] First, the entity categories are further subdivided, and a BiLSTM-CRF model performs entity recognition on a BIO-annotated data set; experiments show that subdivided entities yield better results than unsubdivided ones. Next, a BiGRU-Attention model extracts the relations between entities; the experimental results show that it improves considerably on the BiLSTM-Attention extraction model in accuracy, recall, and F value. Finally, a visual knowledge graph is built with the Neo4j graph database.

[Result/conclusion] This research transforms unstructured community texts into structured data, which helps advance intelligent knowledge services, knowledge representation, and personalized knowledge recommendation in medical communities.

[Innovation/limitation] Entities are subdivided during medical entity recognition, and a breast cancer knowledge graph based on online medical community Q&A texts is successfully constructed. However, some relation types have few samples, which affects the overall evaluation metrics for relation extraction to some degree.
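The BIO annotation and the recall and F value mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: the entity types `DISEASE` and `DRUG`, the example sentence, and the exact-span matching rule are assumptions made for illustration, since the abstract does not list the subdivided entity categories or say how the metrics were computed. Precision is used alongside recall here as the usual companion metric for the F value.

```python
from typing import List, Tuple

# (start index, end index exclusive, entity type); the types are hypothetical.
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence into entity spans."""
    spans: List[Span] = []
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Score predictions by exact span-and-type match (an assumed criterion)."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Character-level tags for a made-up sentence, "乳腺癌用他莫昔芬".
chars = list("乳腺癌用他莫昔芬")
gold_tags = ["B-DISEASE", "I-DISEASE", "I-DISEASE", "O",
             "B-DRUG", "I-DRUG", "I-DRUG", "I-DRUG"]
# The hypothetical prediction cuts the drug name one character short.
pred_tags = ["B-DISEASE", "I-DISEASE", "I-DISEASE", "O",
             "B-DRUG", "I-DRUG", "I-DRUG", "O"]

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)
scores = precision_recall_f1(gold_spans, pred_spans)

for start, end, etype in gold_spans:
    print(etype, "".join(chars[start:end]))
print(scores)
```

Under exact matching, the truncated drug span counts as both a missed entity and a spurious one, so a single boundary error lowers precision and recall together. This is one reason why relation or entity types with few samples can move the overall metrics noticeably.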

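Keywords: medical Q&A community; knowledge graph; bidirectional long short-term memory network; bidirectional gated recurrent unit; deep learning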

Classification code: G250.2 [Cultural Sciences: Library Science]

 
