Mining the Benefits of Two-stage and One-stage HOI Detection

08/11/2021
by Aixi Zhang, et al.

Two-stage methods have dominated Human-Object Interaction (HOI) detection for several years. Recently, one-stage HOI detection methods have become popular. In this paper, we explore the essential pros and cons of two-stage and one-stage methods. We find that conventional two-stage methods mainly struggle to locate positive interactive human-object pairs, while one-stage methods struggle to strike an appropriate trade-off in multi-task learning, i.e., between object detection and interaction classification. A core problem is therefore how to keep the strengths and discard the weaknesses of the two conventional types of methods. To this end, we propose a novel one-stage framework that disentangles human-object detection and interaction classification in a cascade manner. In detail, we first design a human-object pair generator based on a state-of-the-art one-stage HOI detector by removing its interaction classification module or head, and then design a relatively isolated interaction classifier to classify each human-object pair. Each of the two cascaded decoders in our proposed framework can thus focus on one specific task: detection or interaction classification. For the specific implementation, we adopt a transformer-based HOI detector as our base model. The newly introduced disentangling paradigm outperforms existing methods by a large margin, with a significant relative mAP gain of 9.32
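The cascade described in the abstract can be illustrated with a minimal control-flow sketch in Python. It is not the authors' implementation: the paper uses transformer decoders, whereas the names `HOPair`, `HOITriplet`, `cascade_hoi`, the two thresholds, the score product, and the toy stand-in models below are all assumptions introduced only to show how a pair generator (detection only) can feed a separate interaction classifier.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class HOPair:
    """A candidate human-object pair from stage 1 (no interaction label yet)."""
    human_box: Box
    object_box: Box
    object_label: str
    pair_score: float


@dataclass
class HOITriplet:
    """A final <human, interaction, object> prediction."""
    pair: HOPair
    interaction: str
    score: float


def cascade_hoi(
    image: object,
    pair_generator: Callable[[object], List[HOPair]],
    interaction_classifier: Callable[[object, HOPair], Dict[str, float]],
    pair_threshold: float = 0.5,         # hypothetical value
    interaction_threshold: float = 0.5,  # hypothetical value
) -> List[HOITriplet]:
    # Stage 1: detection only. Keep confident human-object pairs.
    pairs = [p for p in pair_generator(image) if p.pair_score >= pair_threshold]

    # Stage 2: classify interactions for each surviving pair, in isolation
    # from the detection step.
    triplets: List[HOITriplet] = []
    for pair in pairs:
        for verb, verb_score in interaction_classifier(image, pair).items():
            if verb_score >= interaction_threshold:
                # Assumption: combine the two scores by multiplication.
                triplets.append(HOITriplet(pair, verb, pair.pair_score * verb_score))
    return sorted(triplets, key=lambda t: t.score, reverse=True)


# Toy stand-ins for the two learned modules (purely illustrative).
def toy_pair_generator(image: object) -> List[HOPair]:
    return [
        HOPair((0, 0, 50, 100), (10, 60, 90, 120), "bicycle", 0.9),
        HOPair((0, 0, 50, 100), (200, 200, 220, 220), "cup", 0.3),
    ]


def toy_interaction_classifier(image: object, pair: HOPair) -> Dict[str, float]:
    return {"ride": 0.8, "hold": 0.2}


result = cascade_hoi(None, toy_pair_generator, toy_interaction_classifier)
for t in result:
    print(t.pair.object_label, t.interaction, round(t.score, 2))
```

In this sketch the second stage never sees pairs rejected by the first, which mirrors the idea that each stage can concentrate on a single task; how the real model shares features between its decoders is not specified here.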

