Enabling ISP-less Low-Power Computer Vision

by Gourav Datta, et al.

In order to deploy current computer vision (CV) models on resource-constrained low-power devices, recent works have proposed in-sensor and in-pixel computing approaches that try to partly or fully bypass the image signal processor (ISP) and yield significant bandwidth reduction between the image sensor and the CV processing unit by downsampling the activation maps in the initial convolutional neural network (CNN) layers. However, direct inference on raw images degrades the test accuracy due to the difference in covariance between the raw images captured by the image sensors and the ISP-processed images used for training. Moreover, it is difficult to train deep CV models on raw images, because most (if not all) large-scale open-source datasets consist of RGB images. To mitigate this concern, we propose to invert the ISP pipeline, which can convert the RGB images of any dataset to their raw counterparts and enable model training on raw images. We release the raw version of the COCO dataset, a large-scale benchmark for generic high-level vision tasks. For ISP-less CV systems, training on these raw images results in a 7.1% increase in test accuracy compared to relying on training with traditional ISP-processed RGB datasets. To further improve the accuracy of ISP-less CV models and to increase the energy and bandwidth benefits obtained by in-sensor/in-pixel computing, we propose an energy-efficient form of analog in-pixel demosaicing that may be coupled with in-pixel CNN computations. When evaluated on raw images captured by real sensors from the PASCALRAW dataset, our approach results in an 8.1% increase in mAP. Lastly, we demonstrate a further 20.5% increase in mAP through an application of few-shot learning with thirty shots each for the novel PASCALRAW dataset, constituting 3 classes.
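To make the idea of inverting the ISP concrete, the following is a minimal sketch of how an RGB image might be mapped back to a raw-like Bayer mosaic. It is not the authors' pipeline: the function name `rgb_to_raw_bayer`, the three steps (inverse gamma, inverse white balance, RGGB mosaicing), and the default `gamma` and `wb_gains` values are all illustrative assumptions chosen for simplicity.

```python
import numpy as np


def rgb_to_raw_bayer(rgb, wb_gains=(2.0, 1.0, 1.5), gamma=2.2):
    """Approximate an RGB image as a single-channel RGGB Bayer mosaic.

    rgb: float array of shape (H, W, 3), values in [0, 1], H and W even.
    wb_gains, gamma: hypothetical ISP parameters (assumptions, not taken
    from the paper); a real ISP would also include color correction,
    tone mapping, denoising, and so on.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an array of shape (H, W, 3)")
    h, w, _ = rgb.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even for a 2x2 Bayer tile")

    # Step 1: undo the display gamma to return to linear intensities.
    linear = np.clip(rgb, 0.0, 1.0) ** gamma

    # Step 2: undo white balance by dividing each channel by its gain.
    linear = linear / np.asarray(wb_gains, dtype=np.float64)

    # Step 3: mosaic, keeping one color sample per pixel in an RGGB layout.
    raw = np.empty((h, w), dtype=np.float64)
    raw[0::2, 0::2] = linear[0::2, 0::2, 0]  # red sites
    raw[0::2, 1::2] = linear[0::2, 1::2, 1]  # green sites on red rows
    raw[1::2, 0::2] = linear[1::2, 0::2, 1]  # green sites on blue rows
    raw[1::2, 1::2] = linear[1::2, 1::2, 2]  # blue sites
    return raw


if __name__ == "__main__":
    image = np.random.default_rng(0).random((4, 6, 3))
    mosaic = rgb_to_raw_bayer(image)
    print(mosaic.shape)
```

Applying such a function to every image of an RGB dataset would produce a raw-like training set with the original labels unchanged, which is the property that makes training on raw data possible without new annotation.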




