Authors: XIANGBA Zhuoma; WANG Zhenzhen[1]; ZHAO Yansong[1]; MA Qin; NI Lei[1]; MA Xingguang[1] (School of Management, Beijing University of Chinese Medicine, Beijing 102488, China; China Traditional Chinese Medicine Press, Beijing 100176, China)
Affiliations: [1] Beijing University of Chinese Medicine, Beijing 102488; [2] China Traditional Chinese Medicine Press, Beijing 100176
Source: Education of Chinese Medicine (《中医教育》), 2025, No. 1, pp. 137-142 (6 pages)
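Funding: Beijing University of Chinese Medicine Philosophy and Social Sciences Cultivation Fund Project (No. 2024-JYB-PY-006); Beijing University of Chinese Medicine Educational Science Research Project (No. XJY22048).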
Abstract: Objective: To evaluate the performance of different large language models in the qualification examination for traditional Chinese medicine (TCM) practitioners. Methods: Five large language models (ERNIE Bot 4.0, ChatGPT 4.0, Baichuan Big Model 3.0, Claude3-Sonnet, and ChatGLM 4.0) were tested for accuracy in answering questions from different disciplines within the TCM practitioner qualification exam question bank. Results: ERNIE Bot 4.0 and Baichuan Big Model 3.0 achieved the highest overall accuracy across the TCM disciplines, with ERNIE Bot 4.0 exceeding 80% accuracy, while ChatGLM 4.0 showed the lowest overall accuracy. By discipline, the five models were more accurate in TCM Internal Medicine and Traditional Chinese Pharmacology, but less accurate in Formula Science and TCM Classics, which require understanding of ancient Chinese medical texts or applied skills; accuracy also differed between models. Conclusion: The performance differences among the models indicate that model performance is influenced by multiple factors, including the content and quality of the training data and the model's own logical reasoning ability. As artificial intelligence technology continues to iterate, using these models as teaching aids is expected to drive change in education. Strengthening model training in specialized fields could further improve the models' understanding and application of professional terminology, better meet the practical needs of teaching, and thereby improve teaching quality and learning efficiency.
Keywords: artificial intelligence; large language models; qualification examination for TCM practitioners; model evaluation
Classification code: G642.474 [Cultural Sciences: Higher Education]