The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges and methods

by Vivek Singh Bawa, et al.

For an autonomous robotic system, monitoring surgeon actions and assisting the main surgeon during a procedure can be very challenging. The challenges stem from the peculiar structure of the surgical scene, from the greater similarity in appearance of actions performed via tools in a cavity compared to, say, human actions in unconstrained environments, and from the motion of the endoscopic camera. This paper presents ESAD, the first large-scale dataset designed to tackle the problem of surgeon action detection in endoscopic minimally invasive surgery. ESAD aims to help increase the effectiveness and reliability of surgical assistant robots by realistically testing their awareness of the actions performed by a surgeon. The dataset provides bounding box annotations for 21 action classes on real endoscopic video frames captured during prostatectomy, and was used as the basis of a recent MIDL 2020 challenge. We also present an analysis of the dataset conducted using the baseline model released as part of the challenge, and a description of the top-performing models submitted to the challenge, together with the results they obtained. This study offers insight into which approaches can be effective and can be extended further. We believe that ESAD will serve as a useful benchmark for researchers active in surgeon action detection and in assistive robotics at large.




ESAD: Endoscopic Surgeon Action Detection Dataset

In this work, we take aim towards increasing the effectiveness of surgic...

CholecTriplet2022: Show me a tool and tell me the triplet – an endoscopic vision challenge for surgical action triplet detection

Formalizing surgical activities as triplets of the used instruments, act...

UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

We introduce UCF101 which is currently the largest dataset of human acti...

Spatio-Temporal Action Detection Under Large Motion

Current methods for spatiotemporal action tube detection often extend a ...

Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection

Despite significant progress in the development of human action detectio...

Unsupervised identification of surgical robotic actions from small non homogeneous datasets

Robot-assisted surgery is an established clinical practice. The automati...

Going Deeper into Recognizing Actions in Dark Environments: A Comprehensive Benchmark Study

While action recognition (AR) has gained large improvements with the int...
