On the Pitfalls of Batch Normalization for End-to-End Video Learning: A Study on Surgical Workflow Analysis

by Dominik Rivoir, et al.

Batch Normalization's (BN) unique property of depending on other samples in a batch is known to cause problems in several tasks, including sequential modeling, and has led to the use of alternatives in these fields. In video learning, however, these problems are less studied, despite the ubiquitous use of BN in CNNs for visual feature extraction. We argue that BN's properties create major obstacles for training CNNs and temporal models end to end in video tasks. Yet, end-to-end learning seems preferable in specialized domains such as surgical workflow analysis, which lack well-pretrained feature extractors. While previous work in surgical workflow analysis has avoided BN-related issues through complex, multi-stage learning procedures, we show that even simple, end-to-end CNN-LSTMs can outperform the state of the art when CNNs without BN are used. Moreover, we analyze in detail when BN-related issues occur, including a "cheating" phenomenon in surgical anticipation tasks. We hope that a deeper understanding of BN's limitations and a reconsideration of end-to-end approaches can be beneficial for future research in surgical workflow analysis and general video learning.
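The batch dependence described above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the helper names `batch_norm_train` and `per_sample_norm`, the feature size, and the random data are all assumptions chosen for illustration. The sketch places the same sample in two different batches and compares its normalized output under training-mode BN with a per-sample normalization (used here only as one generic BN-free alternative, not necessarily the one the paper uses).

```python
import numpy as np

def batch_norm_train(x, eps=1e-5):
    """Training-mode BN (no learned scale/shift): statistics come from the batch axis."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def per_sample_norm(x, eps=1e-5):
    """Hypothetical BN-free alternative: statistics come from each sample's own features."""
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
sample = rng.normal(size=(1, 4))  # one fixed sample with 4 features

# The same sample is placed first in two batches with different companions.
batch_a = np.concatenate([sample, rng.normal(loc=0.0, size=(7, 4))])
batch_b = np.concatenate([sample, rng.normal(loc=3.0, size=(7, 4))])

# Under BN, the sample's output changes with the rest of the batch.
bn_a = batch_norm_train(batch_a)[0]
bn_b = batch_norm_train(batch_b)[0]
bn_depends_on_batch = not np.allclose(bn_a, bn_b)

# Under per-sample normalization, the output is independent of the batch.
ps_a = per_sample_norm(batch_a)[0]
ps_b = per_sample_norm(batch_b)[0]
per_sample_is_independent = np.allclose(ps_a, ps_b)

print("BN output depends on batch:", bn_depends_on_batch)
print("Per-sample norm is batch-independent:", per_sample_is_independent)
```

In a video setting, a batch typically contains consecutive frames of the same clip, so the companions of each sample are highly correlated. That is one way the batch statistics can carry information between samples, which is the kind of leakage the abstract refers to.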




Not End-to-End: Explore Multi-Stage Architecture for Online Surgical Phase Recognition

Surgical phase recognition is of particular interest to computer assiste...

Single- and Multi-Task Architectures for Surgical Workflow Challenge at M2CAI 2016

The surgical workflow challenge at M2CAI 2016 consists of identifying 8 ...

LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition

Automatic surgical workflow recognition in video is an essentially funda...

SUrgical PRediction GAN for Events Anticipation

Comprehension of surgical workflow is the foundation upon which computer...

Surgical Workflow Recognition: from Analysis of Challenges to Architectural Study

Algorithmic surgical workflow recognition is an ongoing research field a...

Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features

Recognizing the phases of a laparoscopic surgery (LS) operation form its...

Term Set Expansion based on Multi-Context Term Embeddings: an End-to-end Workflow

We present SetExpander, a corpus-based system for expanding a seed set o...
