EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos

by Andru P. Twinanda, et al.
Université de Strasbourg

Surgical workflow recognition has numerous potential medical applications, such as the automatic indexing of surgical video databases and the optimization of real-time operating room scheduling. As a result, phase recognition has been studied in the context of several kinds of surgeries, such as cataract, neurological, and laparoscopic surgeries. In the literature, two types of features are typically used to perform this task: visual features and tool usage signals. However, the visual features used are mostly handcrafted, and the tool usage signals are usually collected via a manual annotation process or by using additional equipment. In this paper, we propose a novel method for phase recognition that uses a convolutional neural network (CNN) to automatically learn features from cholecystectomy videos and that relies solely on visual information. Previous studies have shown that tool signals can provide valuable information for the phase recognition task. Thus, we present a novel CNN architecture, called EndoNet, that is designed to carry out the phase recognition and tool presence detection tasks in a multi-task manner. To the best of our knowledge, this is the first work proposing to use a CNN for multiple recognition tasks on laparoscopic videos. Extensive experimental comparisons to other methods show that EndoNet yields state-of-the-art results for both tasks.
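
To make the multi-task idea concrete, here is a minimal sketch of how a joint training objective for the two tasks could be written. It is not taken from the paper: the function names, the equal task weights, and the choice of binary cross-entropy for tool presence (several tools may be visible at once) plus softmax cross-entropy for the phase (one phase per frame) are all assumptions made for illustration. The logits stand in for the outputs of two heads on a shared CNN feature extractor.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax(logits):
    """Softmax with max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tool_presence_loss(tool_logits, tool_labels, eps=1e-12):
    """Mean binary cross-entropy over independent tool labels (assumed multi-label)."""
    losses = []
    for z, y in zip(tool_logits, tool_labels):
        p = sigmoid(z)
        losses.append(-(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)))
    return sum(losses) / len(losses)


def phase_loss(phase_logits, phase_label, eps=1e-12):
    """Softmax cross-entropy for a single phase label (assumed one phase per frame)."""
    probs = softmax(phase_logits)
    return -math.log(probs[phase_label] + eps)


def multi_task_loss(tool_logits, tool_labels, phase_logits, phase_label,
                    tool_weight=1.0, phase_weight=1.0):
    """Weighted sum of the two task losses; the weights are hypothetical."""
    return (tool_weight * tool_presence_loss(tool_logits, tool_labels)
            + phase_weight * phase_loss(phase_logits, phase_label))


if __name__ == "__main__":
    # Hypothetical frame: 3 tool classes, 4 phase classes.
    loss = multi_task_loss([2.0, -1.5, 0.3], [1, 0, 1], [0.1, 0.2, 3.0, -1.0], 2)
    print(f"joint loss: {loss:.4f}")
```

Because both terms are computed from heads that share the same features, minimizing the sum pushes the shared representation to be useful for both tool presence detection and phase recognition, which is the intuition behind training the tasks together.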



Single- and Multi-Task Architectures for Surgical Workflow Challenge at M2CAI 2016

The surgical workflow challenge at M2CAI 2016 consists of identifying 8 ...

Multi-Task Recurrent Convolutional Network with Correlation Loss for Surgical Video Analysis

Surgical tool presence detection and surgical phase recognition are two ...

Single- and Multi-Task Architectures for Tool Presence Detection Challenge at M2CAI 2016

The tool presence detection challenge at M2CAI 2016 consists of identify...

Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features

Recognizing the phases of a laparoscopic surgery (LS) operation from its...

Monitoring tool usage in cataract surgery videos using boosted convolutional and recurrent neural networks

With an estimated 19 million operations performed annually, cataract sur...

Real-time analysis of cataract surgery videos using statistical models

The automatic analysis of the surgical process, from videos recorded dur...
