The Performance Envelope of Inverted Indexing on Modern Hardware

10/24/2019
by Jimmy Lin, et al.

This paper explores the performance envelope of "traditional" inverted indexing on modern hardware using the implementation in the open-source Lucene search library. We benchmark indexing throughput on a single high-end multi-core commodity server in a number of configurations, varying the media of the source collection and target index, and examining a network-attached store, a direct-attached disk array, and an SSD. Experiments show that the largest determinants of performance are the physical characteristics of the source and target media, and that physically isolating the two yields the highest indexing throughput. Results suggest that current indexing techniques have reached physical device limits, and that further algorithmic improvements in performance are unlikely without rethinking the inverted indexing pipeline in light of observed bottlenecks.


research
08/17/2019

Comparison-Based Indexing From First Principles

Basic assumptions about comparison-based indexing are laid down and a ge...
research
05/30/2023

Known by the Company it Keeps: Proximity-Based Indexing for Physical Content in Archival Repositories

Despite the plethora of born-digital content, vast troves of important c...
research
05/25/2018

Dynamicity and Durability in Scalable Visual Instance Search

Visual instance search involves retrieving from a collection of images t...
research
08/21/2020

Metrics and Ambits and Sprawls, Oh My

A follow-up to my previous tutorial on metric indexing, this paper walks...
research
01/21/2019

Predictive Indexing

There has been considerable research on automated index tuning in databa...
research
05/22/2020

Spatial Indexing for System-Level Evaluation of 5G Heterogeneous Cellular Networks

System level simulations of large 5G networks are essential to evaluate ...
research
12/12/2017

Learning a Complete Image Indexing Pipeline

To work at scale, a complete image indexing system comprises two compone...
