Survey of visual perception and understanding in degraded scenarios  Cited by: 1

Visual perception and understanding in degraded scenarios

Authors: Wang Wenjing (汪文靖), Yang Wenhan (杨文瀚), Fang Yuming (方玉明)[3], Huang Hua (黄华)[4], Liu Jiaying (刘家瑛)[1]

Affiliations: [1] Wangxuan Institute of Computer Technology, Peking University, Beijing 100871, China; [2] Department of Strategic and Advanced Interdisciplinary, PengCheng Laboratory, Shenzhen 518055, China; [3] School of Information Management, Jiangxi University of Finance and Economics, Nanchang 330032, China; [4] School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China

Source: Journal of Image and Graphics (《中国图象图形学报》), 2024, No. 6, pp. 1667-1684 (18 pages)

Funding: National Natural Science Foundation of China (62332010).

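Abstract: Images and videos captured in degraded scenarios suffer from complex visual degradation, which impairs visual presentation and perceptual experience and also makes visual analysis and understanding much harder. To address this, the paper systematically analyzes important recent progress, in China and internationally, in visual perception and understanding under degraded scenarios, covering image/video and degradation modeling, visual enhancement in degraded scenarios, and visual analysis and understanding in degraded scenarios. The part on visual data and degradation modeling discusses how images, videos, and the degradation process are modeled under different degradation scenarios, covering noise modeling, downsampling modeling, illumination modeling, and rain and fog modeling. The part on traditional visual enhancement discusses early, non-deep-learning enhancement algorithms, including histogram equalization, Retinex theory, and filtering methods. The part on deep-learning-based visual enhancement is organized around innovations in model architecture and discusses convolutional neural networks, Transformer models, and diffusion models. Whereas traditional visual enhancement aims to improve how the human eye perceives images and videos, the new generation of enhancement and analysis methods considers how well machine vision understands images and videos in degraded scenarios. The part on visual understanding discusses datasets for visual understanding in degraded scenarios, deep-learning-based visual understanding in degraded scenarios, and the joint computation of visual enhancement and understanding. The paper reviews the challenges of this research in detail and traces the development and frontier trends of the technology in China and abroad. Finally, based on this analysis, it looks ahead to future directions for visual perception and understanding in degraded scenarios.

Visual media such as images and videos are crucial means for humans to acquire, express, and convey information. The widespread application of foundational technologies like artificial intelligence and big data has facilitated the gradual integration of systems for the perception and understanding of images and videos into all aspects of production and daily life. However, the emergence of massive applications also brings challenges. Specifically, in open environments, various applications generate vast amounts of heterogeneous data, which leads to complex visual degradation in images and videos. For instance, adverse weather conditions like heavy fog can reduce visibility, which results in the loss of details. Data captured in rainy or snowy weather can exhibit deformations in objects or individuals due to raindrops, which result in structured noise. Low-light conditions can cause severe loss of details and structured information in images. Visual degradation not only diminishes the visual presentation and perceptual experience of images and videos but also significantly affects the usability and effectiveness of existing visual analysis and understanding systems. In today's era of intelligence and information technology, with explosive growth in visual media data, especially in challenging scenarios, visual perception and understanding technologies hold considerable scientific significance and practical value. Traditional visual enhancement techniques can be divided into two categories: spatial domain-based and frequency domain-based methods. Spatial domain methods directly process 2D spatial data, including grayscale transformation, histogram transformation, and spatial domain filtering. Frequency domain methods transform data into the frequency domain through models such as the Fourier transform, process it there, and then restore it to the spatial domain. The development of computer vision technology has facilitated the emergence of more well-designed and robust visual enhancement algorithms, such as dehazing algorithms based on dark channel priors. Since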
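
As a rough illustration of the two families of traditional enhancement named in the abstract, the sketch below pairs a spatial-domain method (histogram equalization) with a frequency-domain one (transform, suppress high frequencies, transform back). It is not taken from the paper: the function names `equalize_histogram`, `dft`, and `lowpass`, the `keep` cutoff parameter, and the use of a 1D signal with a naive transform are all assumptions chosen to keep the example self-contained.

```python
import cmath


def equalize_histogram(image, levels=256):
    """Spatial-domain enhancement: histogram equalization of a grayscale image.

    `image` is a list of rows of integer intensities in [0, levels - 1].
    """
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for tiny examples)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        out.append(acc / n if inverse else acc)
    return out


def lowpass(signal, keep):
    """Frequency-domain processing: transform, zero high frequencies, invert."""
    spectrum = dft(signal)
    n = len(spectrum)
    for k in range(n):
        # Indices k and n - k carry the same frequency magnitude.
        if min(k, n - k) > keep:
            spectrum[k] = 0
    return [value.real for value in dft(spectrum, inverse=True)]
```

For example, `equalize_histogram([[100, 101], [102, 103]])` stretches a low-contrast patch across the full intensity range, while `lowpass(signal, keep=0)` retains only the mean of the signal. A practical implementation would operate on 2D images with a fast Fourier transform rather than this naive loop.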

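Keywords: degraded scenarios; visual perception; visual understanding; image and video enhancement; image and video processing; deep learning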

Classification code: TP391 [Automation and computer technology: computer application technology]

 
