Virtual Bot Interaction System Driven by Visual Context Awareness (cited by: 1)

Design of Visual Context-driven Interactive Bot System

Authors: LIU Yubo; GUO Bin [1]; MA Ke; QIU Chen; LIU Sicong (School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China)

Affiliation: [1] School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China

Source: Computer Science (《计算机科学》), 2023, No. 9, pp. 260-268 (9 pages)

Funding: National Science Fund for Distinguished Young Scholars (62025205); National Natural Science Foundation of China (62032020, 61725205, 62102317).

Abstract: Virtual bots are intelligent software agents that interact with people, and they typically need to be real-time and interactive. This paper studies bots driven by visual context awareness from four aspects: lightweight object detection models and their compression, real-time key frame extraction, system optimization, and interaction strategy. The goal is a strongly real-time, highly interactive, and highly scalable bot system on resource-constrained edge platforms. For lightweight object detection and compression, the paper first examines the performance and accuracy of SSD models with different backbone networks, then applies int8 quantization and pruning to the VGG16-based SSD model, raising the frame rate by 187% over the original model with an accuracy loss of no more than 0.1%. For real-time key frame extraction, the video stream is pre-screened using edge feature strength and HOG features, which reduces system load and is equivalent to a 90% reduction in inference latency. For system optimization, a microservice architecture reduces cold-start latency by about 98%. For the interaction strategy, contexts are modeled with a state machine that includes timers so that the system is context-driven, and speech is used as the output of human-computer interaction.

Keywords: resource-constrained; lightweight model; model compression; object detection; context-driven

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
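
The abstract describes the interaction strategy only at a high level: a state machine with timers models the visual context and triggers speech output. The sketch below is one possible reading of that idea, not the authors' implementation. The state names, the `person` label, the dwell and timeout thresholds, and the utterance strings are all assumptions made for illustration. The returned strings stand in for the speech output and would be passed to a text-to-speech engine in a real system.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ContextStateMachine:
    """Timer-based context model; every name and threshold here is hypothetical."""
    target: str = "person"   # assumed detection label that defines the context
    dwell_s: float = 2.0     # target must stay visible this long before triggering
    timeout_s: float = 5.0   # target must be absent this long before resetting
    state: str = "IDLE"      # IDLE -> PENDING -> ENGAGED -> IDLE
    _since: float = 0.0      # timestamp at which the active timer was (re)started

    def step(self, labels: Set[str], now: float) -> Optional[str]:
        """Consume the labels detected in one key frame; return an utterance or None."""
        seen = self.target in labels
        if self.state == "IDLE":
            if seen:
                # Start the dwell timer on first sighting.
                self.state, self._since = "PENDING", now
        elif self.state == "PENDING":
            if not seen:
                # The target vanished before the dwell time elapsed.
                self.state = "IDLE"
            elif now - self._since >= self.dwell_s:
                self.state, self._since = "ENGAGED", now
                return "Hello, how can I help?"
        elif self.state == "ENGAGED":
            if seen:
                # Each sighting restarts the absence timer.
                self._since = now
            elif now - self._since >= self.timeout_s:
                self.state = "IDLE"
                return "Goodbye."
        return None
```

Feeding the machine one set of labels per key frame keeps it decoupled from the detector. The dwell timer suppresses reactions to brief false positives, and the absence timer avoids ending the interaction on a single missed detection.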
