Authors: QI Siyang (齐思洋), HU Huiyun (胡慧云), LI Hongbing (李洪冰), LI Qi (李琦)[1], XIAO Bo (肖波)[1] (School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source: Journal of Beijing University of Posts and Telecommunications (《北京邮电大学学报》), 2024, No. 4, pp. 50-56 (7 pages)
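Funding: National Natural Science Foundation of China (62076031); Beijing University of Posts and Telecommunications Graduate Innovation and Entrepreneurship Project (2024-YC-T028).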
Abstract: Building a domain-specific question answering system frequently runs into high data costs, complex knowledge construction, and large differences among datasets from different domains. To address these challenges, an approach that integrates large language models with domain knowledge for question answering system construction is proposed. Most existing methods store the local knowledge corpus in segments and match queries against those segments directly. During retrieval-augmented generation, the semantic match between the query text and the segment content is then weak, which lowers the quality of the generated text. A prompt-engineering-based query semantic alignment method (the prompt aligned retrieval generation approach) is therefore proposed: it generates hypothetical question-answer pairs to unify the semantic space of user queries and the corpus, improving both the retrieval efficiency of domain knowledge and the accuracy of answers. Experiments show that the proposed approach avoids the high cost of model training, can be built and deployed quickly in different vertical domains, and outperforms other methods.
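The alignment idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the corpus chunks, the hand-written hypothetical questions (which stand in for questions an LLM would be prompted to generate per chunk), the bag-of-words cosine similarity (standing in for a real embedding model), and all names such as `retrieve` are assumptions made for illustration only. The point it shows is that the user query is matched against question-shaped text rather than against the raw segments.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical corpus segments (illustrative data, not from the paper).
CHUNKS = {
    "c1": "Reset the router by holding the rear button for ten seconds "
          "until the LED blinks.",
    "c2": "Warranty claims must be filed within twelve months of the "
          "purchase date.",
}

# Hypothetical questions, one per chunk. Assumption: in a real system an
# LLM would be prompted to write these from each chunk; here they are
# hard-coded so the sketch runs without any model.
HYPOTHETICAL_QUESTIONS = [
    ("How do I reset my router?", "c1"),
    ("How long do I have to file a warranty claim?", "c2"),
]

def retrieve(query):
    """Return (chunk_id, chunk_text) for the chunk whose hypothetical
    question is most similar to the user query."""
    q = Counter(tokenize(query))
    _, chunk_id = max(
        HYPOTHETICAL_QUESTIONS,
        key=lambda pair: cosine(q, Counter(tokenize(pair[0]))),
    )
    return chunk_id, CHUNKS[chunk_id]

if __name__ == "__main__":
    print(retrieve("how can I reset the router"))
```

Because both sides of the comparison are questions, a short user query and the stored text share phrasing and intent more often than a query and a declarative segment do; the retrieved chunk would then be passed to the language model as context for answer generation.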
Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]