Construction of a syntactic analysis map for the Yishui school through text mining and natural language processing


Authors: ZHAO Hanqing (赵汉青); LI Yuehan (李玥函); ZOU Xinyan (邹欣妍) (College of Traditional Chinese Medicine, Hebei University, Baoding 071000, China)

Affiliation: [1] College of Traditional Chinese Medicine, Hebei University, Baoding, Hebei 071000, China

Source: Medical Research and Education (《医学研究与教育》), 2024, No. 4, pp. 30-37 (8 pages)

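Funding: National Natural Science Foundation of China (82004503); Science and Technology Research Project of Higher Education Institutions of Hebei Province (BJK2024108).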

Abstract: Entity and relation extraction is an essential step in natural language processing tasks such as knowledge graph construction, question answering system design, and semantic analysis. Most information about the Yishui school of traditional Chinese medicine (TCM) is stored as unstructured classical Chinese text, so extracting key information from TCM texts is important for mining and studying TCM academic schools. To address this more efficiently, the study introduces artificial intelligence methods: within a natural language processing framework, it builds a word segmentation and entity-relation extraction model based on conditional random fields (CRF) to identify and extract entity relations from TCM texts. It applies term frequency-inverse document frequency (TF-IDF) weighting to extract key entity information from different ancient texts, and uses neural-network-based dependency parsing to analyze passages of those texts, revealing the grammatical relations between entities and representing them as tree structures for visualization. This work lays a foundation for subsequently building a Yishui school knowledge graph and for applying artificial intelligence methods to research on TCM academic schools.

Keywords: natural language processing; knowledge graph; Yishui school; syntactic analysis

Classification code: R2 [Medicine and Health: Traditional Chinese Medicine]
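
The TF-IDF weighting named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give their exact formula, so the sketch assumes the common form tf = count / document length and idf = log(N / df), and the function names `tf_idf` and `top_terms` and the toy token lists are hypothetical. It also assumes each text has already been segmented into tokens (the abstract attributes that step to a CRF model, which is not reproduced here).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (text that is already segmented).
    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(doc_weights, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy, made-up token lists standing in for three segmented texts.
toy_docs = [
    ["spleen", "stomach", "qi", "herb"],
    ["kidney", "yin", "deficiency", "herb"],
    ["spleen", "deficiency", "herb"],
]

if __name__ == "__main__":
    for i, w in enumerate(tf_idf(toy_docs)):
        print(i, top_terms(w))
```

A term that occurs in every text (here `herb`) gets weight zero under this formula, which is why TF-IDF tends to surface terms characteristic of one text rather than vocabulary shared by all of them.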

 
