Upper, Middle and Lower Region Learning for Facial Action Unit Detection

02/10/2020
by Yao Xia, et al.

Facial action unit (AU) detection is fundamental to facial expression analysis. Because each AU occurs only in a small area of the face, region-based learning is widely recognized as useful for AU detection. Most region-based studies focus on the small region where an AU occurs. Focusing on a specific region helps eliminate the influence of identity, but risks losing information, and it is difficult to balance the two. In this study, I propose a simple strategy: I divide the face into three large regions (upper, middle and lower) and group AUs by the region in which they occur. I propose a new end-to-end deep learning framework named the three regions based attention network (TRA-Net). After extracting the global feature, TRA-Net uses a hard attention module to extract three feature maps, each of which contains only one specific region. Each region-specific feature map is fed to an independent branch. In each branch, three consecutive soft attention modules extract higher-level features for the final AU detection. On the DISFA dataset, this model achieves the highest F1 scores for the detection of AU1, AU2 and AU4, and the highest accuracy in comparison with state-of-the-art methods.
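
The hard attention step described above can be pictured as multiplying a feature map by three binary masks, one per face region. The following is a minimal sketch of that idea only, not the TRA-Net implementation: the function names, the equal-thirds row boundaries, and the small AU grouping are all assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch: band boundaries and AU grouping below are assumptions,
# not values from the paper.

def region_masks(height, width, bounds=(1 / 3, 2 / 3)):
    """Build binary masks for the upper, middle and lower bands of a 2D map."""
    top = round(height * bounds[0])
    bottom = round(height * bounds[1])
    spans = {"upper": (0, top), "middle": (top, bottom), "lower": (bottom, height)}
    return {
        name: [[1.0 if lo <= r < hi else 0.0 for _ in range(width)]
               for r in range(height)]
        for name, (lo, hi) in spans.items()
    }


def apply_hard_attention(feature_map, mask):
    """Keep values inside the region and zero out everything else."""
    return [[v * m for v, m in zip(f_row, m_row)]
            for f_row, m_row in zip(feature_map, mask)]


# Hypothetical grouping of a few AUs by face region (assumption for illustration).
AU_GROUPS = {"upper": [1, 2, 4], "middle": [9], "lower": [12, 25, 26]}

# Toy 6x4 "global feature map"; a real one would come from a CNN backbone.
feature_map = [[float(r * 4 + c) for c in range(4)] for r in range(6)]
masks = region_masks(6, 4)
regions = {name: apply_hard_attention(feature_map, m) for name, m in masks.items()}
```

Each entry of `regions` would then be passed to its own branch, which predicts only the AUs listed for that region in `AU_GROUPS`. Because the three masks here do not overlap and together cover the whole map, summing the three region maps recovers the original feature map.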
