Weakly Supervised Temporal Convolutional Networks for Fine-grained Surgical Activity Recognition

by Sanat Ramesh, et al.
University of Verona
Université de Strasbourg

Automatic recognition of fine-grained surgical activities, called steps, is a challenging but crucial task for intelligent intra-operative computer assistance. Current vision-based activity recognition methods rely heavily on large volumes of manually annotated data, which are difficult and time-consuming to generate and require domain-specific knowledge. In this work, we propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition with fewer step-annotated videos. We introduce a step-phase dependency loss to exploit the weak supervision signal. We then employ a Single-Stage Temporal Convolutional Network (SS-TCN) with a ResNet-50 backbone, trained in an end-to-end fashion from weakly annotated videos, for temporal activity segmentation and recognition. We extensively evaluate the proposed method and show its effectiveness on a large video dataset consisting of 40 laparoscopic gastric bypass procedures and on the public benchmark CATARACTS, which contains 50 cataract surgeries.
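
The abstract does not give the form of the step-phase dependency loss, so the following is only an illustrative sketch of one way a phase label could supervise step predictions, not the authors' formulation. It assumes each phase admits a known set of steps and penalizes probability mass placed on steps outside the annotated phase. The function names, the `STEPS_IN_PHASE` mapping, and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dependency_penalty(step_logits, phase, steps_in_phase, eps=1e-12):
    """Negative log of the probability mass on steps allowed in `phase`.

    Assumed form, for illustration only: the penalty is near zero when the
    predicted step distribution is consistent with the phase label and grows
    as mass moves to steps that cannot occur in that phase.
    """
    probs = softmax(step_logits)
    allowed_mass = sum(probs[s] for s in steps_in_phase[phase])
    return -math.log(allowed_mass + eps)


# Hypothetical hierarchy: phase 0 contains steps 0 and 1, phase 1 contains steps 2 and 3.
STEPS_IN_PHASE = {0: {0, 1}, 1: {2, 3}}

# Frame-level step logits that strongly favor step 0 (made-up values).
frame_logits = [4.0, 1.0, -2.0, -2.0]

consistent = dependency_penalty(frame_logits, 0, STEPS_IN_PHASE)
inconsistent = dependency_penalty(frame_logits, 1, STEPS_IN_PHASE)
print(consistent < inconsistent)
```

Under this assumption, a frame that carries only a phase label still yields a training signal for the step head, which is the general idea behind using phases as weak supervision.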



Fine-grained Activity Recognition in Baseball Videos

In this paper, we introduce a challenging new dataset, MLB-YouTube, desi...

Activity Detection in Long Surgical Videos using Spatio-Temporal Models

Automatic activity detection is an important component for developing te...

Less is More: Surgical Phase Recognition from Timestamp Supervision

Surgical phase recognition is a fundamental task in computer-assisted su...

Automated Surgical Activity Recognition with One Labeled Sequence

Prior work has demonstrated the feasibility of automated activity recogn...

Multi-level Contrast Network for Wearables-based Joint Activity Segmentation and Recognition

Human activity recognition (HAR) with wearables is promising research th...

Attention is All We Need: Nailing Down Object-centric Attention for Egocentric Activity Recognition

In this paper we propose an end-to-end trainable deep neural network mod...
