Key technologies and construction strategies of large language models for traditional Chinese medicine

Original title: 中医药大语言模型的关键技术与构建策略

Cited by: 3

Authors: XIAO Wenke (萧文科); SONG Chi (宋驰); CHEN Shilin (陈士林); CHEN Wei (陈伟) [1,2]

Affiliations: [1] Innovative Institute of Chinese Medicine and Pharmacy/Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, Sichuan, China [2] Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, Sichuan, China

Source: Chinese Traditional and Herbal Drugs (《中草药》), 2024, No. 17, pp. 5747-5756 (10 pages)

Funding: Talent Introduction Project of Chengdu University of Traditional Chinese Medicine (030041225).

Abstract: By processing and understanding natural language data, large language models (LLMs) enable high-quality information retrieval, knowledge extraction, and related functions, providing new opportunities for traditional Chinese medicine (TCM) research. Based on the current state of LLM development in TCM, this work reviews the data storage and processing methods used in LLM development; outlines artificial intelligence methods such as retrieval-augmented generation, mixture of experts, reinforcement learning from human feedback, and knowledge distillation; and summarizes methods for training, fine-tuning, and evaluating the performance of LLMs. In response to the characteristics of TCM data, strategies for building TCM LLMs are proposed, covering high-quality dataset construction, integration of multi-domain expert systems, rapid information extraction, and model training and tuning. Specific application scenarios of LLMs in TCM are also analyzed. The aim is to provide a reference for the construction and application of LLMs in TCM and to promote the modernization and intelligent development of TCM.

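Keywords: traditional Chinese medicine; large language model; mixture of experts; retrieval-augmented generation; reinforcement learning from human feedback; knowledge distillation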

Classification codes: R28 [Medicine and Health: Chinese Materia Medica]; TP18 [Medicine and Health: Traditional Chinese Medicine]

 
