Towards practical lipreading with distilled and efficient models

by Pingchuan Ma, et al.

Lipreading has made substantial progress thanks to the resurgence of neural networks. Recent work has focused on improving performance by finding the optimal architecture or on improving generalization. However, there is still a significant gap between current methodologies and the requirements for effective deployment of lipreading in practical scenarios. In this work, we propose a series of innovations that significantly narrow that gap. First, we raise the state-of-the-art performance by a wide margin on LRW and LRW-1000 to 88.6 optimization. Second, we propose a series of architectural changes, including a novel depthwise-separable TCN head, that reduce the computational cost to a fraction of that of the (already quite efficient) original model. Third, we show that knowledge distillation is a very effective tool for recovering the performance of the lightweight models. This results in a range of models with different accuracy-efficiency trade-offs. Our most promising lightweight models are on par with the current state-of-the-art while reducing computational cost by 8x and the number of parameters by 4x, which we hope will enable the deployment of lipreading models in practical applications.
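The two efficiency ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first pair of functions compares weight counts for a standard 1D convolution and a depthwise-separable one, the kind of layer a temporal convolutional (TCN) head is built from. The last function is a generic distillation loss in the style commonly used for teacher-student training. The function names, the `alpha` weighting, and the `temperature` parameter are assumptions made for illustration, and the usual temperature-squared scaling of the distillation term is omitted for brevity.

```python
import math


def conv1d_params(c_in, c_out, k):
    """Weight count of a standard 1D convolution (bias omitted)."""
    return c_in * c_out * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise conv (one k-tap filter per input
    channel) followed by a pointwise 1x1 conv that mixes channels."""
    return c_in * k + c_in * c_out


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=1.0):
    """alpha * CE(student, label) + (1 - alpha) * KL(teacher || student).

    Hypothetical weighting for illustration; the source does not
    specify the exact loss used.
    """
    # Cross-entropy of the student against the hard label.
    ce = -math.log(softmax(student_logits)[label])
    # KL divergence between temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / max(s, 1e-12))
             for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * ce + (1 - alpha) * kl
```

For example, with 512 input and output channels and a kernel size of 3 (values chosen only for illustration), `conv1d_params(512, 512, 3)` gives 786432 weights while `depthwise_separable_params(512, 512, 3)` gives 263680, roughly a threefold reduction for that single layer. The distillation term vanishes when the student's logits match the teacher's, so the student is penalized only where its predicted distribution departs from the teacher's.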




