Consistency Training of Multi-exit Architectures for Sensor Data
Deep neural networks have become larger over the years, with a corresponding increase in the computational resources they demand for inference; this incurs escalating costs and leaves little room for deployment on devices with limited battery and other resources for real-time applications. Multi-exit architectures are a type of deep neural network interleaved with several output (or exit) layers at varying depths of the model. They provide a sound approach to improving the computational time and energy utilization of running a model by producing predictions from early exits. In this work, we present a novel, architecture-agnostic approach for robust training of multi-exit architectures, termed consistent exit training. The crux of the method lies in a consistency-based objective that enforces prediction invariance over clean and perturbed inputs. We leverage weak supervision to align model output with consistency training and jointly optimize the dual losses in a multi-task learning fashion over the exits of a network. Our technique enables the exit layers to generalize better when confronted with increasing uncertainty, resulting in superior quality-efficiency trade-offs. We demonstrate through extensive evaluation on challenging learning tasks involving sensor data that our approach allows examples to exit earlier, with a better detection rate and without executing all the layers of a deep model.
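For concreteness, below is a minimal sketch of what such a per-exit dual-loss objective could look like in PyTorch. It is an illustration under assumptions, not the authors' implementation: the model is assumed to return one logit tensor per exit, and the perturbation, the weighting scheme (alpha), and all names are hypothetical.

    import torch
    import torch.nn.functional as F

    def consistent_exit_loss(model, x_clean, x_perturbed, labels, alpha=1.0):
        # Hypothetical sketch: `model` is assumed to return a list of
        # logit tensors, one per exit layer; `alpha` weights the
        # consistency term against the supervised term.
        exits_clean = model(x_clean)      # predictions on the clean input
        exits_pert = model(x_perturbed)   # predictions on a perturbed view

        total = 0.0
        for logits_c, logits_p in zip(exits_clean, exits_pert):
            # Supervised term: cross-entropy at each exit,
            # optimized jointly in a multi-task fashion.
            supervised = F.cross_entropy(logits_c, labels)
            # Consistency term: KL divergence pulling the perturbed-input
            # distribution toward the clean-input distribution at this exit,
            # enforcing prediction invariance under perturbation.
            consistency = F.kl_div(
                F.log_softmax(logits_p, dim=-1),
                F.softmax(logits_c, dim=-1),
                reduction="batchmean",
            )
            total = total + supervised + alpha * consistency
        return total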