Learning from a tiny dataset of manual annotations: a teacher/student approach for surgical phase recognition

by Tong Yu, et al.
Université de Strasbourg

Vision algorithms capable of interpreting scenes from a real-time video stream are necessary for computer-assisted surgery systems to achieve context-aware behavior. In laparoscopic procedures, one particular algorithm needed for such systems is the identification of surgical phases, for which the current state of the art is a model based on a CNN-LSTM. A number of previous works using models of this kind have trained them in a fully supervised manner, requiring a fully annotated dataset. Instead, our work confronts the problem of learning surgical phase recognition in scenarios presenting scarce amounts of annotated data (under 25% of the training set). We propose a teacher/student type of approach, where a strong predictor called the teacher, trained beforehand on a small dataset of ground truth-annotated videos, generates synthetic annotations for a larger dataset, which another model, the student, learns from. In our case, the teacher features a novel CNN-biLSTM-CRF architecture, designed for offline inference only. The student, on the other hand, is a CNN-LSTM capable of making real-time predictions. Results for various amounts of manually annotated videos demonstrate the superiority of the new CNN-biLSTM-CRF predictor, as well as improved performance from the CNN-LSTM trained using synthetic labels generated for unannotated videos. For both offline and online surgical phase recognition with very few annotated recordings available, this new teacher/student strategy provides a valuable performance improvement by efficiently leveraging the unannotated data.
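The data flow of the teacher/student strategy can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' implementation: each video is a list of scalar frame features, a nearest-centroid rule stands in for both networks, and all function names (fit_centroids, teacher_predict, student_predict, train_teacher_student) are hypothetical. The only properties it tries to mirror are that the teacher labels unannotated videos offline using past and future frames, while the student sees only past frames and is trained on the teacher's synthetic labels.

```python
from statistics import mean

def fit_centroids(videos, labels):
    """Toy 'training': per-phase mean of a scalar frame feature."""
    by_phase = {}
    for video, phases in zip(videos, labels):
        for x, p in zip(video, phases):
            by_phase.setdefault(p, []).append(x)
    return {p: mean(xs) for p, xs in by_phase.items()}

def nearest_phase(centroids, x):
    """Phase whose centroid is closest to feature value x."""
    return min(centroids, key=lambda p: abs(centroids[p] - x))

def teacher_predict(centroids, video, radius=1):
    """Offline stand-in: each frame is smoothed with past AND future frames."""
    return [
        nearest_phase(centroids, mean(video[max(0, t - radius): t + radius + 1]))
        for t in range(len(video))
    ]

def student_predict(centroids, video, radius=1):
    """Online stand-in: each frame uses only frames up to the current one."""
    return [
        nearest_phase(centroids, mean(video[max(0, t - radius): t + 1]))
        for t in range(len(video))
    ]

def train_teacher_student(annotated, unannotated):
    """annotated: list of (video, labels); unannotated: list of videos."""
    videos, labels = zip(*annotated)
    teacher = fit_centroids(videos, labels)           # small ground-truth set
    synthetic = [teacher_predict(teacher, v) for v in unannotated]
    # Assumption: the student here learns from synthetic labels only; whether
    # ground-truth videos are also included is a choice this sketch leaves out.
    student = fit_centroids(unannotated, synthetic)
    return teacher, student, synthetic

annotated = [([0.0, 0.1, 1.0, 1.1], [0, 0, 1, 1])]
unannotated = [[0.0, 0.2, 0.9, 1.0]]
teacher, student, synthetic = train_teacher_student(annotated, unannotated)
online_phases = student_predict(student, [0.0, 0.0, 1.0, 1.0])
```

In this toy run the causal student reports the phase change one frame late on the last video, a small reminder that a real-time predictor cannot look ahead the way the offline teacher can.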




