Authors: HE Jing (何静); CHEN Yiran (陈逸然); DAI Tianyu (戴田宇)
Affiliations: [1] Institute for Advanced Studies in Humanities and Social Sciences, Beihang University, Beijing 100191, China; [2] School of Artificial Intelligence, Beihang University, Beijing 100191, China; [3] School of Mathematics and Computer Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
Source: Experimental Technology and Management (《实验技术与管理》), 2025, No. 2, pp. 96-103
Funding: National Natural Science Foundation of China Young Scientists Fund, "Truth in Hallucination: Simulation and Governance of AIGC Rumor Propagation under Multi-Agent Interaction" (62406016); Beijing Educational Science "14th Five-Year Plan" project, "Research on Teaching Quality Evaluation in Universities Based on Multimodal Data" (CGCA23128)
Abstract:
[Objective] The application of artificial intelligence models across various scenarios is rapidly expanding. However, their hallucination problem poses risks by increasing misleading outputs and reducing user trust in practical applications. This study explores the propagation and control mechanisms of hallucinations in large language models (LLMs) and assesses their impact on decision-making processes.
[Methods] The SEIR rumor propagation dynamics model was used to test the susceptibility of three LLMs, namely GPT-4-Turbo, Claude-3, and Llama-3, on tasks involving legal text judgments. The baseline hallucination levels of these models in legal text analysis were established to evaluate their sensitivity and explore their propagation and control mechanisms. Given the high accuracy and expert judgment required in this domain, the study used 1,500 real-world legal case texts and verdict results provided by the China Association of Artificial Intelligence and Legal Challenge. Through fuzzy and exact testing, the analysis revealed the inherent hallucination tendencies of the models. Control experiments were designed to analyze the sensitivity of the LLMs to disruptive text in legal analysis before and after content generation. The study also investigated how interaction with high-hallucination-rate models affects susceptible models, potentially increasing their hallucination rates. To mitigate this, knowledge injection and prompt fine-tuning strategies were applied, with recovery processes simulated through differential equations, and the change in hallucination rates after multiple intervention rounds was calculated. The study further evaluated the possibility of secondary hallucination infection when corrected models were exposed to new information.
[Results] The three models effectively identified basic case information and key features but showed randomness and inaccuracies when quantifying penalties in specific legal cases. They were significantly affected by disruptive texts, highlighting the need for stronger mechanisms to enhance their robustness; disruption introduced after content generation degraded accuracy more than disruption introduced beforehand. Models with high hallucination rates raised the hallucination rates of susceptible models through information transfer, while knowledge injection and prompt fine-tuning interventions reduced hallucination rates over successive rounds.
[Conclusions] More comprehensive management and optimization measures are needed to prevent secondary infection and keep models in a healthy state over the long term. The study offers a new perspective on controlling hallucinations in large models and important insights for managing their long-term robustness.
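The record reproduces only the abstract; the paper's exact SEIR formulation and fitted coefficients are not given here. As a minimal, purely illustrative sketch, the Python below integrates the textbook SEIR compartment equations (dS/dt = -βSI, dE/dt = βSI - σE, dI/dt = σE - γI, dR/dt = γI) that the study adapts to hallucination spread among interacting models. All parameter values, and the mapping of compartments to model states, are assumptions for illustration, not the authors' settings.

```python
# Illustrative SEIR dynamics, loosely matching the abstract's framing:
#   S - susceptible models, E - exposed (received hallucinated content),
#   I - infected (actively reproducing hallucinations), R - recovered
#   (e.g., corrected via knowledge injection or prompt fine-tuning).
# Parameter values below are made up for demonstration only.

def seir_step(s, e, i, r, beta, sigma, gamma, dt):
    """One forward-Euler step of the classic SEIR ODE system.

    beta  : transmission rate (exposure through inter-model messages)
    sigma : rate at which exposed models start hallucinating
    gamma : recovery rate (strength of the intervention strategies)
    """
    new_exposed   = beta * s * i * dt
    new_infected  = sigma * e * dt
    new_recovered = gamma * i * dt
    return (s - new_exposed,
            e + new_exposed - new_infected,
            i + new_infected - new_recovered,
            r + new_recovered)

# Start with one hallucinating model in a mostly susceptible population.
s, e, i, r = 0.99, 0.0, 0.01, 0.0
for _ in range(1000):  # integrate to t = 100 with dt = 0.1
    s, e, i, r = seir_step(s, e, i, r,
                           beta=0.4, sigma=0.2, gamma=0.1, dt=0.1)
print(f"t=100: S={s:.3f} E={e:.3f} I={i:.3f} R={r:.3f}")
```

Under this reading, the trajectory of I(t) corresponds to the fraction of actively hallucinating models, and raising γ (stronger knowledge injection or prompt fine-tuning) shortens the outbreak, while the residual S at steady state indicates models that remain at risk of the secondary infection the abstract warns about.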