Data Mining and Knowledge Q&A System for Traditional Chinese Medicine Formulas in Ancient Books

Original title: 中医古籍方剂数据挖掘与知识问答系统构建

Authors: LI Ming [1]; LUO Xiaolan [2]; ZHU Bangxian [3]

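Affiliations: [1] Institute of Science, Technology and Humanities, Shanghai University of Traditional Chinese Medicine; [2] Library, Shanghai University of Traditional Chinese Medicine; [3] Shanghai University of Traditional Chinese Medicine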

Source: Library Tribune (《图书馆论坛》), 2025, No. 4, pp. 49-59 (11 pages)

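Funding: Research output of the National Social Science Fund of China Major Project "Research on the Mining, Collation, and Translation Standardization of Basic Terminology in Traditional Chinese Medicine" (Project No. 19ZDA301).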

Abstract: Taking the Treatise on Febrile Diseases (《伤寒论》) and other ancient Chinese medical books as data sources, and following their catalog structure, this paper uses ChatGLM to extract formula (prescription) information from the books and stores it in a MySQL relational database, building a retrieval system for formulas in ancient books of Traditional Chinese Medicine (TCM). The formula information is then parsed with ChatGLM, and data mining algorithms such as Apriori, association_rules, and community_louvain, together with knowledge graph tools such as ECharts and Pyvis, are applied to mine the ancient books and visualize the resulting knowledge graphs, yielding an LLM-based data mining system for TCM formulas. Finally, with the extracted formula information as the source, a knowledge Q&A system for TCM formulas based on retrieval-augmented generation is built on the BISHENG platform. The results show that recall for formula name extraction ranges from 99.19% to 100%; except for Integrating Chinese and Western Medicine·Formulas (《医学衷中参西录·方剂篇》), the ROUGE-L values for the extracted formula composition, indications, and usage range from 84.29% to 97.78%; the accuracy of Chinese medicine name and dosage recognition exceeds 98.00%, and the accuracy of indication parsing exceeds 86.00%. The data mining results are consistent with existing research on these books, and the knowledge Q&A results are as expected.
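The abstract reports extraction quality as ROUGE-L values. As a rough illustration of what that metric measures, the following is a minimal character-level sketch in Python. It assumes character tokenization (a common choice for Chinese text) and the F-measure with beta = 1; the paper's actual tokenization, ROUGE-L variant, and evaluation data are not given in this record, so the function names and the sample strings below are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """Return (recall, precision, F) of ROUGE-L over characters.

    Assumption: each character is one token; beta weights recall
    against precision in the F-measure.
    """
    ref, cand = list(reference), list(candidate)
    if not ref or not cand:
        return 0.0, 0.0, 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0, 0.0, 0.0
    recall = lcs / len(ref)
    precision = lcs / len(cand)
    f = (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
    return recall, precision, f


# Illustrative strings only, not taken from the paper's evaluation set.
gold = "桂枝三两芍药三两"       # hand-checked formula composition
extracted = "桂枝三两芍药二两"  # hypothetical model output with one wrong character
recall, precision, f = rouge_l(gold, extracted)
```

In this example the two strings share a longest common subsequence of 7 of their 8 characters, so recall, precision, and F all equal 0.875. A single wrong dosage character therefore lowers the score only slightly, which may be one reason such a metric is reported alongside separate accuracy figures for medicine names and dosages.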

Keywords: large language model; retrieval-augmented generation; data mining; knowledge Q&A; ancient TCM books

Classification No.: R2-03 [Medicine and Health: Traditional Chinese Medicine]

 
