检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:CUI Wencheng SHI Wentao SHAO Hong 崔文成
出 处:《High Technology Letters》2025年第1期73-85,共13页高技术通讯(英文版)
基 金:Supported by the National Natural Science Foundation of China(No.62001313);the Liaoning Professional Talent Protect(No.XLYC2203046);the Shenyang Municipal Medical Engineering Cross Research Foundation of China(No.22-321-32-09).
摘 要:Medical visual question answering(MedVQA)aims to enhance diagnostic confidence and deepen patientsunderstanding of their health conditions.While the Transformer architecture is widely used in multimodal fields,its application in MedVQA requires further enhancement.A critical limitation of contemporary MedVQA systems lies in the inability to integrate lifelong knowledge with specific patient data to generate human-like responses.Existing Transformer-based MedVQA models require enhancing their capabitities for interpreting answers through the applications of medical image knowledge.The introduction of the medical knowledge graph visual language transformer(MKGViLT),designed for joint medical knowledge graphs(KGs),addresses this challenge.MKGViLT incorporates an enhanced Transformer structure to effectively extract features and combine modalities for MedVQA tasks.The MKGViLT model delivers answers based on richer background knowledge,thereby enhancing performance.The efficacy of MKGViLT is evaluated using the SLAKE and P-VQA datasets.Experimental results show that MKGViLT surpasses the most advanced methods on the SLAKE dataset.
关 键 词:knowledge graph(KG) medical vision question answer(MedVQA) vision-andlanguage transformer
分 类 号:TP3[自动化与计算机技术—计算机科学与技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.49