Authors: YANG Rongyu, LIU Fengli [1,2], HAO Yongping [2]
Affiliations: [1] School of Mechanical Engineering, Shenyang University of Technology (沈阳理工大学), Shenyang 110159, Liaoning, China; [2] Liaoning Provincial Key Laboratory of Advanced Manufacturing Technology and Equipment, Shenyang University of Technology (沈阳理工大学), Shenyang 110159, Liaoning, China
Source: Journal of Detection & Control (《探测与控制学报》), 2024, No. 5, pp. 71-79 (9 pages)
Funding: Key Laboratory Fund Project for Equipment Pre-Research (2021JCJQLB055009).
Abstract: Single detection methods have limited reach on the modern battlefield, and single-modality target recognition suffers from incomplete information and susceptibility to noise. To address these shortcomings, a target recognition method that fuses the acoustic and optical modalities is proposed. A deep convolutional residual network extracts log-mel spectral coefficient features from the voiceprint information, while a YOLOX-S network extracts the optical features of the target and computes its image-space position and category. A branch for processing sound features is then introduced into the decoupled head of the prediction part of the YOLOX-S model. The optical and acoustic characteristics of the target are spatially normalized on the classification branch of the YOLOX-S detection head, so that the visual data and the voiceprint data are mapped and fused in the same concatenable domain, and recognition inference is performed on the fused acousto-optical features. The method was validated on a self-built dataset. The experimental results show that fusing voiceprint information with image information provides a more comprehensive perception capability and makes target detection and recognition more accurate and reliable.
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]
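
Illustrative note: the abstract says that acoustic and visual features are brought into "the same concatenable domain" on the classification branch, but it does not specify the operation. The sketch below shows one common way such a fusion could work, under the assumption that "spatial normalization" means broadcasting a one-dimensional audio embedding to the spatial size of the visual feature map and then concatenating along the channel axis. The function names, shapes, and values are hypothetical and are not taken from the paper.

```python
from typing import List

# A feature map stored as nested lists: [channel][row][col].
FeatureMap = List[List[List[float]]]


def tile_audio_embedding(audio_vec: List[float], height: int, width: int) -> FeatureMap:
    """Broadcast a D-element audio embedding to a D x H x W map.

    Assumption: every spatial cell receives the same audio value per channel.
    """
    return [[[v for _ in range(width)] for _ in range(height)] for v in audio_vec]


def fuse_by_concat(visual_map: FeatureMap, audio_vec: List[float]) -> FeatureMap:
    """Concatenate visual channels with the tiled audio channels (hypothetical fusion)."""
    height = len(visual_map[0])
    width = len(visual_map[0][0])
    audio_map = tile_audio_embedding(audio_vec, height, width)
    # Concatenating the outer lists is a channel-wise concatenation.
    return visual_map + audio_map


# Toy inputs: 2 visual channels on a 2 x 2 grid, and a 3-element audio embedding.
visual = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8]],
]
audio = [1.0, -1.0, 0.5]

fused = fuse_by_concat(visual, audio)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a real detector the fused map would feed the classification layers of the detection head; the sketch only covers the shape alignment that makes concatenation possible.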