Engram-Driven Videography (Chinese title: 记忆成像)

Authors: Lu Fang (方璐); Mengqi Ji (季梦奇); Xiaoyun Yuan (袁肖赟); Jing He (贺敬); Jianing Zhang (张嘉凝); Yinheng Zhu (朱胤恒); Tian Zheng (郑添); Leyao Liu (刘乐遥); Bin Wang (王滨); Qionghai Dai (戴琼海)

Affiliations: [1] Department of Electronic Engineering, Tsinghua University, Beijing 100084, China; [2] Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China; [3] Tsinghua–Berkeley Shenzhen Institute, Shenzhen 518071, China; [4] Institute for Brain and Cognitive Science, Tsinghua University, Beijing 100084, China; [5] Department of Automation, Tsinghua University, Beijing 100084, China; [6] Beijing Laboratory of Brain and Cognitive Intelligence, Beijing Municipal Education Commission, Beijing 100010, China; [7] Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou 310012, China

Source: Engineering (工程, English edition), 2023, Issue 6, pp. 101-109, M0005 (10 pages in total)

Funding: The Shuimu Tsinghua Scholar Program; the National Natural Science Foundation of China (62125106, 61860206003, and 62088102); in part by the Shenzhen Science and Technology Research and Development Funds (JCYJ20180507183706645); in part by the Ministry of Science and Technology of China (2021ZD0109901); in part by the Beijing National Research Center for Information Science and Technology (BNR2020RC01002); the China Postdoctoral Science Foundation (2020TQ0172, 2020M670338, and YJ20200109); and the Postdoctoral International Exchange Program (YJ20210124).

Abstract: Sensing and understanding large-scale dynamic scenes require a high-performance imaging system. Conventional imaging systems pursue higher capability simply by increasing pixel resolution through camera stitching, at the expense of a bulky system. Moreover, they strictly follow a feedforward pathway: their pixel-level sensing is independent of semantic understanding. In contrast, the human visual system benefits from both feedforward and feedback pathways: the feedforward pathway extracts an object representation (referred to as a memory engram) from visual inputs, while in the feedback pathway the associated engram is reactivated to generate hypotheses about an object. Inspired by this, we propose a dual-pathway imaging mechanism called engram-driven videography. We start by abstracting a holistic representation of the scene, which is associated bidirectionally with local details and driven by an instance-level engram. Technically, the entire system works by alternating between the excitation–inhibition state and the association state. In the former, pixel-level details are dynamically consolidated or inhibited to strengthen the instance-level engram. In the association state, spatially and temporally consistent content is synthesized, driven by its engram, to image future scenes with outstanding videography quality. Results of extensive simulations and experiments demonstrate that the proposed system revolutionizes the conventional videography paradigm and shows great potential for videography of large-scale scenes with multiple objects.

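Keywords: human visual system; pixel resolution; imaging system; pixel-level; bidirectional association; dynamic scenes; feedforward; imaging mechanism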

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]
