Automatic generation of semantic corpora for improving intent estimation of taxonomy-driven search engines

03/30/2022
by Lorenzo Massai, et al.

With the increasing demand for intelligent systems capable of operating in different user contexts (e.g., users on the move), correctly interpreting the user's need has become crucial for giving a consistent answer to the user's query. The most effective techniques for this task come from the fields of natural language processing and semantic expansion of terms. Systems built on these techniques aim to estimate the actual meaning of input queries by addressing the concepts behind the words in the user's question. The aim of this paper is to demonstrate which semantic relation has the greatest impact in semantic expansion-based retrieval systems and to identify the best tradeoff between accuracy and noise introduction when such relations are combined. The evaluation is carried out by building a simple natural language processing system capable of querying any taxonomy-driven domain, using combinations of different semantic expansions as knowledge resources. A wide and varied taxonomy serves as the use case, with its labels as the basis for the expansions. To build the knowledge resources, several corpora were produced and integrated as gazetteers into the NLP infrastructure, with the purpose of estimating the pseudo-queries corresponding to the taxonomy labels, which are treated as the possible intents.
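
As a rough illustration of the idea described above, the following minimal sketch expands taxonomy labels into a gazetteer and matches query tokens against it. It is not the authors' system: the labels, the expansion terms, the relation names, and the functions `build_gazetteer` and `estimate_intent` are all hypothetical, and a real setup would presumably draw expansions from a lexical resource rather than a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical toy data: taxonomy label -> expansion terms grouped by
# semantic relation. None of these terms come from the paper.
EXPANSIONS = {
    "restaurant": {
        "synonym": ["eatery", "diner"],
        "hyponym": ["pizzeria", "bistro"],
    },
    "pharmacy": {
        "synonym": ["drugstore", "chemist"],
        "hyponym": ["dispensary"],
    },
}


def build_gazetteer(expansions, relations):
    """Map each term to the set of taxonomy labels (intents) it may signal.

    Only the semantic relations listed in `relations` are included, so
    different combinations of relations can be compared.
    """
    gazetteer = defaultdict(set)
    for label, by_relation in expansions.items():
        gazetteer[label].add(label)  # the label always matches itself
        for relation in relations:
            for term in by_relation.get(relation, []):
                gazetteer[term].add(label)
    return gazetteer


def estimate_intent(query, gazetteer):
    """Return candidate labels ranked by the number of matching query tokens."""
    scores = defaultdict(int)
    for token in query.lower().split():
        for label in gazetteer.get(token, ()):
            scores[label] += 1
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))


synonyms_only = build_gazetteer(EXPANSIONS, ["synonym"])
with_hyponyms = build_gazetteer(EXPANSIONS, ["synonym", "hyponym"])
```

Adding a relation widens coverage: in this toy data a query mentioning "pizzeria" matches nothing with synonyms alone but matches "restaurant" once hyponyms are included. Each added relation can also introduce spurious matches, which is the accuracy-versus-noise tradeoff the abstract refers to.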
