Bidirectional Text Compression in External Memory

by Patrick Dinklage, et al.

Bidirectional compression algorithms work by substituting repeated substrings with references that, unlike in the famous LZ77 scheme, can point in either direction. We present such an algorithm that is particularly suited to an external memory implementation. We evaluate it experimentally on large data sets of up to 128 GiB (using only 16 GiB of RAM) and show that it is significantly faster than all known LZ77 compressors, while producing a roughly similar number of factors. We also introduce an external memory decompressor for texts compressed with any uni- or bidirectional compression scheme.
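
To illustrate what a reference that points in either direction means, here is a minimal in-memory sketch of decoding a bidirectional factorization. It is not the external memory decompressor from the paper, which the abstract does not describe. The factor representation (a literal string, or a hypothetical `(source position, length)` pair) and the function name `decode_bidirectional` are assumptions made for this example only.

```python
from typing import List, Optional, Tuple, Union

# Assumed factor format for this sketch (not taken from the paper):
#   - a str is stored verbatim (literal characters)
#   - a (source_position, length) pair copies `length` characters starting
#     at text position `source_position`, which may lie to the left OR to
#     the right of the factor itself
Factor = Union[str, Tuple[int, int]]


def decode_bidirectional(factors: List[Factor]) -> str:
    chars: List[Optional[str]] = []  # known character per text position
    source: List[int] = []           # position copied from, or -1 for literals

    # Step 1: lay out the text, one entry per text position.
    for factor in factors:
        if isinstance(factor, str):
            for c in factor:
                chars.append(c)
                source.append(-1)
        else:
            start, length = factor
            for k in range(length):
                chars.append(None)
                source.append(start + k)

    n = len(chars)

    # Step 2: resolve each position by following its chain of references
    # until a known character is reached, then fill in the whole chain.
    for i in range(n):
        path: List[int] = []
        j = i
        while chars[j] is None:
            if len(path) >= n:
                raise ValueError("references form a cycle; text is undefined")
            path.append(j)
            j = source[j]
            if not 0 <= j < n:
                raise ValueError("reference points outside the text")
        for p in path:
            chars[p] = chars[j]

    return "".join(c for c in chars if c is not None)
```

For example, `decode_bidirectional([(3, 3), "a", "b", "c"])` yields `"abcabc"`: the first factor copies from text that only appears later, which a unidirectional LZ77 parse cannot express. A chain of references that never reaches a literal cannot be decoded, so the sketch rejects it.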



Efficient Construction of the BWT for Repetitive Text Using String Compression

We present a new semi-external algorithm that builds the Burrows-Wheeler...

O(n log n)-time text compression by LZ-style longest first substitution

Mauer et al. [A Lempel-Ziv-style Compression Method for Repetitive Texts...

Optimal alphabet for single text compression

A text can be viewed via different representations, i.e. as a sequence o...

Re-Pair In-Place

Re-Pair is a grammar compression scheme with favorably good compression ...

Approximating Optimal Bidirectional Macro Schemes

Lempel-Ziv is an easy-to-compute member of a wide family of so-called ma...

Preserved central model for faster bidirectional compression in distributed settings

We develop a new approach to tackle communication constraints in a distr...

M2T: Masking Transformers Twice for Faster Decoding

We show how bidirectional transformers trained for masked token predicti...
