LABRAD-OR: Lightweight Memory Scene Graphs for Accurate Bimodal Reasoning in Dynamic Operating Rooms

by Ege Özsoy, et al.

Modern surgeries are performed in complex and dynamic settings, including ever-changing interactions between medical staff, patients, and equipment. The holistic modeling of the operating room (OR) is, therefore, a challenging but essential task, with the potential to optimize the performance of surgical teams and aid in developing new surgical technologies to improve patient outcomes. The holistic representation of surgical scenes as semantic scene graphs (SGG), where entities are represented as nodes and relations between them as edges, is a promising direction for fine-grained semantic OR understanding. We propose, for the first time, the use of temporal information for more accurate and consistent holistic OR modeling. Specifically, we introduce memory scene graphs, where the scene graphs of previous time steps act as the temporal representation guiding the current prediction. We design an end-to-end architecture that intelligently fuses the temporal information of our lightweight memory scene graphs with the visual information from point clouds and images. We evaluate our method on the 4D-OR dataset and demonstrate that integrating temporality leads to more accurate and consistent results, achieving a +5 [...] the path for representing the entire surgery history with memory scene graphs and improves the holistic understanding in the OR. Introducing scene graphs as memory representations can offer a valuable tool for many temporal understanding tasks.
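
To make the idea of a memory scene graph concrete, the following is a minimal sketch of the data structure only, not the authors' architecture. It assumes a scene graph can be stored as a set of (subject, predicate, object) triplets per time step, and that the memory is simply the ordered list of earlier graphs. All class names, method names, and the entity and relation labels are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple

# Assumption: one relation is a (subject, predicate, object) triplet.
Triplet = Tuple[str, str, str]


@dataclass(frozen=True)
class SceneGraph:
    """Scene graph of a single time step: entities are nodes, relations are edges."""
    timestep: int
    triplets: FrozenSet[Triplet]

    def nodes(self) -> FrozenSet[str]:
        # Every subject and object that appears in at least one relation.
        return frozenset(n for s, _, o in self.triplets for n in (s, o))


@dataclass
class MemorySceneGraphs:
    """Hypothetical container for the scene graphs of previous time steps."""
    history: List[SceneGraph] = field(default_factory=list)

    def add(self, graph: SceneGraph) -> None:
        # Keep the memory in strictly increasing temporal order.
        if self.history and graph.timestep <= self.history[-1].timestep:
            raise ValueError("scene graphs must be added in temporal order")
        self.history.append(graph)

    def recent(self, window: int) -> List[SceneGraph]:
        # The last `window` graphs, e.g. as a short-term temporal context.
        return self.history[-window:] if window > 0 else []

    def seen_triplets(self) -> FrozenSet[Triplet]:
        # Every relation observed so far, e.g. as a summary of the whole history.
        return frozenset().union(*(g.triplets for g in self.history))


# Illustrative labels only; they are not taken from the dataset annotation.
memory = MemorySceneGraphs()
memory.add(SceneGraph(0, frozenset({("patient", "lying_on", "operating_table")})))
memory.add(SceneGraph(1, frozenset({("patient", "lying_on", "operating_table"),
                                    ("head_surgeon", "close_to", "patient")})))
```

In a design like this, the memory stays lightweight because it holds only symbolic triplets rather than raw point clouds or images. How such a memory is encoded and fused with the visual features is specific to the paper and is not reproduced here.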


4D-OR: Semantic Scene Graphs for OR Domain Modeling

Surgical procedures are conducted in highly complex operating rooms (OR)...

Towards Holistic Surgical Scene Understanding

Most benchmarks for studying surgical interventions focus on a specific ...

Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling

Video Semantic Role Labeling (VidSRL) aims to detect the salient events ...

Incremental 3D Semantic Scene Graph Prediction from RGB Sequences

3D semantic scene graphs are a powerful holistic representation as they ...

Enhancing Social Relation Inference with Concise Interaction Graph and Discriminative Scene Representation

There has been a recent surge of research interest in attacking the prob...

Bidirectional Projection Network for Cross Dimension Scene Understanding

2D image representations are in regular grids and can be processed effic...

DisguisOR: Holistic Face Anonymization for the Operating Room

Purpose: Recent advances in Surgical Data Science (SDS) have contributed...
