Grammar-compressed Self-index with Lyndon Words

04/11/2020
by Kazuya Tsuruta, et al.

We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data structure of O(g) words of space that can be built from a string T in O(n + g lg g) time, retrieving the starting positions of all occurrences of a pattern P of length m in O(m + lg m lg n + occ lg g) time, where n is the length of T, g is the size of the Lyndon SLP for T, and occ is the number of occurrences of P in T.
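As a rough illustration of the idea, the following is a naive sketch that assumes a Lyndon SLP is the grammar whose derivation tree is the Lyndon tree of T: every node is split by the standard factorization (the right part is the longest proper suffix that is a Lyndon word), and identical substrings share one variable. It also assumes T is itself a Lyndon word. The function names are hypothetical, and this brute-force construction is far slower than the building time stated above; it is not the authors' algorithm.

```python
def is_lyndon(w):
    """A non-empty string is a Lyndon word if it is strictly smaller
    than each of its proper suffixes."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def standard_factorization(w):
    """Split a Lyndon word w (length >= 2) into (u, v), where v is the
    longest proper suffix of w that is a Lyndon word."""
    for i in range(1, len(w)):
        if is_lyndon(w[i:]):
            return w[:i], w[i:]
    raise ValueError("w must have length at least 2")


def build_lyndon_slp(t):
    """Return (root, productions). productions[x] is either a single
    character or a pair (left, right) of variable indices."""
    if not is_lyndon(t):
        raise ValueError("this sketch assumes t is a Lyndon word")
    var_of = {}        # derived string -> variable index (shares equal subtrees)
    productions = []

    def visit(w):
        if w in var_of:
            return var_of[w]
        if len(w) == 1:
            rhs = w
        else:
            u, v = standard_factorization(w)
            rhs = (visit(u), visit(v))
        var_of[w] = len(productions)
        productions.append(rhs)
        return var_of[w]

    return visit(t), productions


def expand(var, productions):
    """Decompress the string derived by variable var."""
    rhs = productions[var]
    if isinstance(rhs, str):
        return rhs
    return expand(rhs[0], productions) + expand(rhs[1], productions)


if __name__ == "__main__":
    root, productions = build_lyndon_slp("aababb")
    for x, rhs in enumerate(productions):
        print(f"X{x} -> {rhs}")
    print(expand(root, productions))
```

In this sketch the grammar size g corresponds to `len(productions)`, and every variable derives a Lyndon word by construction.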

