Data Augmentation for Human Behavior Analysis in Multi-Person Conversations

08/03/2023
by Kun Li, et al.

In this paper, we present the solution of our team HFUT-VUT for the MultiMediate Grand Challenge 2023 at ACM Multimedia 2023. The solution covers three sub-challenges: bodily behavior recognition, eye contact detection, and next speaker prediction. We select Swin Transformer as the baseline and exploit data augmentation strategies to address all three tasks. Specifically, we crop the raw video to remove noise from other parts of the frame, and we apply data augmentation to improve the generalization of the model. As a result, our solution achieves the best results on the corresponding test sets: a mean average precision of 0.6262 for bodily behavior recognition and an accuracy of 0.7771 for eye contact detection. In addition, our approach achieves a comparable result of 0.5281 for next speaker prediction in terms of unweighted average recall.
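The cropping and augmentation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say which augmentations are used or how the crop region is chosen, so everything below is an assumption made for illustration: the function names `crop_clip` and `augment_clip`, the fixed crop box, and the choice of a horizontal flip plus a brightness shift are all hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def crop_clip(clip, box):
    """Crop every frame of a clip shaped (T, H, W, C).

    box = (top, left, height, width) is a hypothetical fixed region
    around the target participant; the paper's actual crop is not given.
    """
    top, left, height, width = box
    return clip[:, top:top + height, left:left + width, :]


def augment_clip(clip, rng, flip_prob=0.5, max_brightness_shift=0.2):
    """Apply one random flip and one brightness shift to all frames.

    These two augmentations are assumed examples, not the authors' choices.
    The same random draw is shared by every frame so the clip stays
    temporally consistent.
    """
    out = clip.astype(np.float32) / 255.0  # scale pixel values to [0, 1]
    if rng.random() < flip_prob:
        out = out[:, :, ::-1, :]  # horizontal flip along the width axis
    shift = rng.uniform(-max_brightness_shift, max_brightness_shift)
    return np.clip(out + shift, 0.0, 1.0)


# Synthetic clip standing in for real video: 8 frames of 240x320 RGB.
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(8, 240, 320, 3), dtype=np.uint8)

person = crop_clip(clip, box=(40, 100, 160, 120))
augmented = augment_clip(person, rng)
```

In a real pipeline the augmented clip would then be resized and passed to the video backbone. Whether a given augmentation preserves the label depends on the task; a horizontal flip, for instance, may not be safe when the label encodes which participant is being looked at.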


