Depressive semantic awareness from vlog facial and vocal streams via spatio-temporal transformer  


Authors: Yongfeng Tao, Minqiang Yang, Yushan Wu, Kevin Lee, Adrienne Kline, Bin Hu

Affiliations: [1] School of Information Science and Engineering, Lanzhou University, Lanzhou, China; [2] The School of Accounting, Auditing and Taxation, Business School, UNSW Sydney, Australia; [3] Department of Preventive Medicine, Northwestern University, Chicago, IL, United States

Source: Digital Communications and Networks (数字通信与网络, English edition), 2024, Issue 3, pp. 577-585 (9 pages)

Funding: Supported in part by the STI 2030-Major Projects (2021ZD0202002); in part by the National Natural Science Foundation of China (Grant No. 62227807); in part by the Natural Science Foundation of Gansu Province, China (Grant No. 22JR5RA488); in part by the Fundamental Research Funds for the Central Universities (Grant No. lzujbky-2023-16); and by the Supercomputing Center of Lanzhou University.

Abstract: With the rapid growth of information transmission via the Internet, efforts have been made to reduce network load to promote efficiency. One such application is semantic computing, which can extract and process semantic communication. Social media has enabled users to share their current emotions, opinions, and life events through their mobile devices. Notably, people suffering from mental health problems are more willing to share their feelings on social networks. Therefore, it is necessary to extract semantic information from social media (vlog data) to identify abnormal emotional states to facilitate early identification and intervention. Most studies do not consider spatio-temporal information when fusing multimodal information to identify abnormal emotional states such as depression. To solve this problem, this paper proposes a spatio-temporal squeeze transformer method for the extraction of semantic features of depression. First, a module with spatio-temporal data is embedded into the transformer encoder, which is utilized to obtain a representation of spatio-temporal features. Second, a classifier with a voting mechanism is designed to encourage the model to classify depression and non-depression effectively. Experiments are conducted on the D-Vlog dataset. The results show that the method is effective, and the accuracy rate can reach 70.70%. This work provides scaffolding for future work in the detection of affect recognition in semantic communication based on social media vlog data.
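
The abstract mentions a classifier with a voting mechanism but does not specify how the votes are formed or combined. As a rough illustration only, the sketch below assumes simple majority voting over per-segment binary predictions for one vlog. The function name `majority_vote`, the label convention (1 = depression, 0 = non-depression), and the tie-breaking rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def majority_vote(segment_labels: Sequence[int], tie_label: int = 0) -> int:
    """Return the most common label among per-segment predictions.

    Assumption (not from the paper): 1 = depression, 0 = non-depression,
    and a tie between the two top labels falls back to `tie_label`.
    """
    if not segment_labels:
        raise ValueError("segment_labels must not be empty")
    ranked = Counter(segment_labels).most_common()
    # A tie exists when the two most frequent labels have equal counts.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


# Hypothetical per-segment predictions for a single vlog.
video_label = majority_vote([1, 1, 0, 1, 0])
```

A video-level decision of this kind is less sensitive to a few misclassified segments than a single prediction would be, which is one plausible motivation for voting; the paper's actual design may differ.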

Keywords: Emotional computing; Semantic awareness; Depression recognition; Vlog data

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
