Fish Feeding Behavior Recognition Using Adaptive DMCA-UMT Algorithm  


Authors: Caiwei Yang, Xinting Yang, Kaijie Zhu, Chao Zhou

Affiliations: [1] National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China; [2] Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; [3] National Engineering Laboratory for Agri-product Quality Traceability, Beijing 100097, China; [4] College of Computer and Information Engineering, Tianjin Agricultural University, Tianjin 300384, China

Source: Journal of Beijing Institute of Technology (北京理工大学学报(英文版)), 2023, No. 3, pp. 285-297 (13 pages)

Funding: Supported by the Beijing Natural Science Foundation (No. 6212007), the National Key Technology R&D Program of China (No. 2022YFD2001701), and the Youth Research Fund of Beijing Academy of Agricultural and Forestry Sciences (No. QNJJ202014).

Abstract: Real-time analysis of fish feeding behavior is the premise of, and key to, accurate feeding guidance. Identifying fish behavior from a single source of information is susceptible to various factors. To overcome this problem, this paper proposes an adaptive deep modular co-attention unified multi-modal transformers (DMCA-UMT) algorithm. By fusing video, audio, and water quality parameters, the whole process of fish feeding behavior can be identified. First, features are extracted from the input video, audio, and water quality parameter information to obtain feature vectors for the different modalities. Second, deep modular co-attention (DMCA) is introduced on the basis of the original cross-modal encoder, and adaptive learnable weights are added; the feature vector of the joint video-audio representation is obtained by automatic learning based on fusion contribution. Finally, the fused visual-audio information and the text features are used to generate clip-level moment queries. The query decoder decodes the input features and uses the prediction head to obtain the final joint moment retrieval, that is, the start and end times of fish feeding. The results show that the mAP Avg of the proposed algorithm reaches 75.3%, which is 37.8% higher than that of the unified multi-modal transformers (UMT) algorithm.
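The abstract does not say how the adaptive learnable weights are parameterized. The following is a minimal sketch of one common choice, assuming one raw score per modality that is normalized with a softmax and used to form a weighted sum of the video and audio feature vectors. The names `softmax`, `fuse`, and `logits` are hypothetical and do not come from the paper. In a trained model the scores would be updated by gradient descent; here they are fixed by hand to show the effect.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(video_feat, audio_feat, logits):
    """Weighted sum of two same-length feature vectors.

    logits holds two raw scores (video, audio). This parameterization
    is an assumption made for illustration, not the paper's method.
    """
    if len(video_feat) != len(audio_feat):
        raise ValueError("feature vectors must have the same length")
    w_video, w_audio = softmax(logits)
    return [w_video * v + w_audio * a for v, a in zip(video_feat, audio_feat)]


# Toy feature vectors (made-up values, for illustration only).
video = [1.0, 2.0, 3.0]
audio = [3.0, 2.0, 1.0]

# Equal scores give equal weights, so the result is the plain average.
fused_equal = fuse(video, audio, [0.0, 0.0])

# A much larger video score makes the fused vector follow the video features.
fused_video_heavy = fuse(video, audio, [10.0, -10.0])
```

With equal scores the fusion reduces to averaging the two modalities; as one score grows, the fused representation leans toward that modality, which is the sense in which each modality's contribution can be adapted.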

Keywords: aquaculture; multi-modal fusion; deep modular co-attention (DMCA); unified multi-modal transformers (UMT); video moment retrieval

Classification No.: S951.2 [Agricultural Science: Aquaculture]; TP391.41 [Agricultural Science: Fisheries Science]

 
