Proximity Full-Text Search with a Response Time Guarantee by Means of Additional Indexes

Full-text search engines are important tools for information retrieval. Term proximity is an important factor in relevance score measurement. In a proximity full-text search, we assume that a relevant document contains query terms near each other, especially if the query terms are frequently occurring words. A methodology for high-performance full-text query execution is discussed. We build additional indexes to achieve better efficiency. For a word that occurs in the text, we include in the indexes some information about nearby words. What types of additional indexes do we use? How do we use them? These questions are discussed in this work. We present the results of experiments showing that the average time of search query execution is 44-45 times less than that required when using ordinary inverted indexes. This is a pre-print of a contribution "Veretennikov A.B. Proximity Full-Text Search with a Response Time Guarantee by Means of Additional Indexes" published in "Arai K., Kapoor S., Bhatia R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 868" published by Springer, Cham. The final authenticated version is available online at: https://doi.org/10.1007/978-3-030-01054-6_66. The work was supported by Act 211 Government of the Russian Federation, contract no 02.A03.21.0006.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/06/2020

Proximity full-text searches of frequently occurring words with a response time guarantee

Full-text search engines are important tools for information retrieval. ...
research
12/18/2018

Proximity Full-Text Search with a Response Time Guarantee by Means of Additional Indexes with Multi-Component Keys

Full-text search engines are important tools for information retrieval. ...
research
09/06/2020

An Improved Algorithm for Fast K-Word Proximity Search Based on Multi-Component Key Indexes

A search query consists of several words. In a proximity full-text searc...
research
01/09/2021

Selection of Optimal Parameters in the Fast K-Word Proximity Search Based on Multi-component Key Indexes

Proximity full-text search is commonly implemented in contemporary full-...
research
08/01/2021

Relevance ranking for proximity full-text search based on additional indexes with multi-component keys

The problem of proximity full-text search is considered. If a search que...
research
01/27/2018

Using Additional Indexes for Fast Full-Text Search of Phrases That Contains Frequently Used Words

Searches for phrases and word sets in large text arrays by means of addi...
research
07/15/2018

Discovering Latent Concepts and Exploiting Ontological Features for Semantic Text Search

Named entities and WordNet words are important in defining the content o...

Please sign up or login with your details

Forgot password? Click here to reset