The Design, Implementation, and Deployment of a System to Transparently Compress Hundreds of Petabytes of Image Files for a File-Storage Service

by Daniel Reiter Horn, et al.

We report the design, implementation, and deployment of Lepton, a fault-tolerant system that losslessly compresses JPEG images to 77% of their original size on average. Lepton replaces the lowest layer of baseline JPEG compression (a Huffman code) with a parallelized arithmetic code, so that the exact bytes of the original JPEG file can be recovered quickly. Lepton matches the compression efficiency of the best prior work, while decoding more than nine times faster and in a streaming manner. Lepton has been released as open-source software and has been deployed for a year on the Dropbox file-storage backend. As of February 2017, it had compressed more than 203 PiB of user JPEG files, saving more than 46 PiB.
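The core contract described in the abstract is transparent, bit-exact recompression: the storage backend re-encodes each file with a stronger entropy coder, verifies that the original bytes can be recovered exactly, and only then discards the original. The sketch below illustrates that round-trip-verified pattern in Python, using the standard-library `lzma` codec as a stand-in for Lepton's JPEG-specific arithmetic coder (the function names and the fallback behavior are illustrative assumptions, not Lepton's actual API):

```python
import lzma

def compress_for_storage(original: bytes) -> bytes:
    """Re-encode bytes with a stronger codec, verifying bit-exact recovery.

    lzma here is only a placeholder for Lepton's parallelized arithmetic
    coder; the round-trip check mirrors the safety property the paper
    requires before the original file may be discarded.
    """
    recompressed = lzma.compress(original)
    # Verify losslessness up front: if decoding does not reproduce the
    # exact input bytes, refuse to substitute the recompressed form.
    if lzma.decompress(recompressed) != original:
        raise ValueError("round-trip failed; keep the original bytes")
    return recompressed

def restore(stored: bytes) -> bytes:
    """Recover the exact original bytes on read."""
    return lzma.decompress(stored)

# Toy payload standing in for a JPEG file (starts with the JPEG SOI marker).
data = b"\xff\xd8\xff\xe0" + b"example JPEG-like payload " * 100
stored = compress_for_storage(data)
assert restore(stored) == data      # byte-identical recovery
assert len(stored) < len(data)      # net storage savings
```

The write-time verification step is what makes the substitution safe to deploy at petabyte scale: a file is only stored in recompressed form if decoding it back has already been proven to yield the identical bytes.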


