Evaluating large language models as patient education tools for inflammatory bowel disease: A comparative study


Authors: Yan Zhang, Xiao-Han Wan, Qing-Zhou Kong, Han Liu, Jun Liu, Jing Guo, Xiao-Yun Yang, Xiu-Li Zuo, Yan-Qing Li

Affiliations: [1] Department of Gastroenterology, Qilu Hospital of Shandong University, Jinan 250012, Shandong Province, China; [2] Laboratory of Translational Gastroenterology, Qilu Hospital of Shandong University, Jinan 250012, Shandong Province, China; [3] Robot Engineering Laboratory for Precise Diagnosis and Therapy of GI Tumor, Qilu Hospital of Shandong University, Jinan 250012, Shandong Province, China; [4] Shandong Provincial Clinical Research Center for Digestive Disease, Qilu Hospital of Shandong University, Jinan 250012, Shandong Province, China

Source: World Journal of Gastroenterology, 2025, Issue 6, pp. 34-43 (10 pages)

Funding: Supported by the China Health Promotion Foundation Young Doctors' Research Foundation for Inflammatory Bowel Disease, the Taishan Scholars Program of Shandong Province, China, No. tsqn202306343; National Natural Science Foundation of China, No. 82270578.

Abstract:
BACKGROUND Inflammatory bowel disease (IBD) is a global health burden that affects millions of individuals worldwide, necessitating extensive patient education. Large language models (LLMs) hold promise for addressing patient information needs. However, the use of LLMs to deliver accurate and comprehensible IBD-related medical information has yet to be thoroughly investigated.
AIM To assess the utility of three LLMs (ChatGPT-4.0, Claude-3-Opus, and Gemini-1.5-Pro) as a reference point for patients with IBD.
METHODS In this comparative study, two gastroenterology experts generated 15 IBD-related questions that reflected common patient concerns. These questions were used to evaluate the performance of the three LLMs. The answers provided by each model were independently assessed by three IBD-related medical experts using a Likert scale focusing on accuracy, comprehensibility, and correlation. Simultaneously, three patients were invited to evaluate the comprehensibility of the models' answers. Finally, a readability assessment was performed.
RESULTS Overall, each of the LLMs achieved satisfactory levels of accuracy, comprehensibility, and completeness when answering IBD-related questions, although their performance varied. All of the investigated models demonstrated strengths in providing basic disease information, such as the definition of IBD, its common symptoms, and diagnostic methods. Nevertheless, when dealing with more complex medical advice, such as medication side effects, dietary adjustments, and complication risks, the quality of answers was inconsistent between the LLMs. Notably, Claude-3-Opus generated answers with better readability than the other two models.
CONCLUSION LLMs have potential as educational tools for patients with IBD; however, there are discrepancies between the models. Further optimization and the development of specialized models are necessary to ensure the accuracy and safety of the information provided.

Keywords: Inflammatory bowel disease; Large language models; Patient education; Medical information accuracy; Readability assessment

Classification code: R574 [Medicine and Health: Digestive System]

 
