Recent one-stage transformer-based methods achieve notable gains in the
...
In this paper, we present a cross-modal recipe retrieval framework,
Tran...
Human-object interaction (HOI) detection as a downstream task of object detec...
In this paper, we tackle a challenging domain conversion task between ph...
To minimize the annotation costs associated with the training of semanti...
Bottom-up and top-down visual cues are two types of information that hel...
The character information in natural scene images contains various perso...