ABC-Net: Semi-Supervised Multimodal GAN-based Engagement Detection using an Affective, Behavioral and Cognitive Model
We present ABC-Net, a novel semi-supervised multimodal GAN framework, grounded in psychology literature, for detecting engagement levels in video conversations. We use three constructs (behavioral, cognitive, and affective engagement) to extract features that can effectively capture engagement levels. We feed these features to our semi-supervised GAN, which performs regression on the resulting latent representations to obtain the corresponding valence and arousal values; these values are then categorized into different levels of engagement. We demonstrate the efficiency of our network through experiments on the RECOLA database. To evaluate our method, we analyze and compare our performance on RECOLA and report a relative performance improvement of more than 5%. To our knowledge, this is the first method to classify engagement based on a multimodal semi-supervised network.