Authors: Rui LIU, Yahong HAN
Affiliations: [1] College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; [2] Tianjin Key Lab of Machine Learning, Tianjin University, Tianjin 300350, China
Source: Frontiers of Computer Science (中国计算机科学前沿(英文版)), 2022, Issue 6, pp. 93-101 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61876130, 61932009).
Abstract: Video question answering (Video QA) involves a thorough understanding of video content and question language, as well as the grounding of textual semantics in the visual content of videos. Thus, to answer questions more accurately, not only should each semantic entity be associated with a certain visual instance in the video frames, but the action or event in the question should also be localized to a corresponding temporal slot. This turns out to be a more challenging task, one that requires the ability to reason over correlations between instances along temporal frames. In this paper, we propose an instance-sequence reasoning network for video question answering with instance grounding and temporal localization. In our model, both visual instances and textual representations are first embedded into graph nodes, which benefits intra- and inter-modality integration. Then, we propose graph causal convolution (GCC) on graph-structured sequences with a large receptive field to capture more causal connections, which is vital for visual grounding and instance-sequence reasoning. Finally, we evaluate our model on the TVQA+ dataset, which contains the ground truth of instance grounding and temporal localization, as well as on three other Video QA datasets and three multimodal language processing datasets. Extensive experiments demonstrate the effectiveness and generalization of the proposed method. Specifically, our method outperforms the state-of-the-art methods on these benchmarks.
Keywords: video question answering; instance grounding; graph causal convolution
Classification No.: TP391.1 [Automation and Computer Technology: Computer Application Technology]