STDG: Semi-Teacher-Student Training Paradigram for Depth-guided One-stage Scene Graph Generation

by Xukun Zhou, et al.

Scene Graph Generation is a critical enabler of environmental comprehension for autonomous robotic systems. Most existing methods, however, are thwarted by background complexity, which limits their ability to fully decode the inherent topological information of the environment. Additionally, the wealth of contextual information encapsulated within depth cues is often left untapped, rendering existing approaches less effective. To address these shortcomings, we present STDG, a Depth-Guided One-Stage Scene Graph Generation methodology. The architecture of STDG comprises three custom-built modules: the Depth Guided HHA Representation Generation Module, the Depth Guided Semi-Teaching Network Learning Module, and the Depth Guided Scene Graph Generation Module. Together, these modules harness depth information at every stage, from depth signal generation and depth feature utilization to the final scene graph prediction. Importantly, this is achieved without imposing additional computational burden during the inference phase. Experimental results confirm that our method significantly enhances the performance of one-stage scene graph generation baselines.




