Language Detection Engine for Multilingual Texting on Mobile Devices

01/07/2021
by Sourabh Vasant Gothe, et al.

More than 2 billion mobile users worldwide type in multiple languages on soft keyboards. On a monolingual keyboard, 38% of falsely auto-corrected words are valid in another language. This can be avoided by detecting the language of each typed word and then validating the word in that language. Language detection is a well-known problem in natural language processing. In this paper, we present a fast, lightweight and accurate Language Detection Engine (LDE) for multilingual typing that dynamically adapts to the user's intended language in real time. We propose a novel approach in which a character N-gram model is fused with a logistic regression based selector model to identify the language. Additionally, we present a method that significantly reduces inference time through a parameter reduction technique. We also discuss various optimizations made across LDE to resolve ambiguity in input text among languages that share the same character patterns. Our method demonstrates an average accuracy of 94.5% for Indian languages in Latin script and 98% for European languages on code-switched data. The model outperforms fastText by 60.39% and ML-Kit by 23.67% in F1 score for European languages. LDE is also faster on mobile devices, with an average inference time of 25.91 microseconds.
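The fusion described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the trigram order, the add-one smoothing, the class names `CharNgramModel` and `ToyDetector`, and the hand-set `weight` and `bias` of the logistic selector are all assumptions made for illustration. In a real system the selector would be trained, and its features are not specified in the abstract.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = f"^{word.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramModel:
    """Add-one smoothed character n-gram model for a single language."""

    def __init__(self, words, n=3):
        self.n = n
        self.counts = Counter(g for w in words for g in char_ngrams(w, n))
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1  # one extra slot for unseen n-grams

    def avg_log_prob(self, word):
        """Average smoothed log-probability of the word's n-grams."""
        denom = self.total + self.vocab
        grams = char_ngrams(word, self.n)
        if not grams:  # empty input yields no n-grams to score
            return math.log(1.0 / denom)
        return sum(math.log((self.counts[g] + 1) / denom) for g in grams) / len(grams)


class ToyDetector:
    """Picks the best-scoring language and attaches a logistic confidence.

    weight and bias are hypothetical, hand-set values; a trained
    logistic regression selector would learn them from data.
    """

    def __init__(self, corpora, weight=4.0, bias=0.0):
        self.models = {lang: CharNgramModel(words) for lang, words in corpora.items()}
        self.weight = weight
        self.bias = bias

    def detect(self, word):
        ranked = sorted(
            ((model.avg_log_prob(word), lang) for lang, model in self.models.items()),
            reverse=True,
        )
        best_score, best_lang = ranked[0]
        # Margin over the runner-up is the single feature fed to the selector.
        margin = best_score - ranked[1][0] if len(ranked) > 1 else 0.0
        confidence = 1.0 / (1.0 + math.exp(-(self.weight * margin + self.bias)))
        return best_lang, confidence


if __name__ == "__main__":
    corpora = {
        "en": ["the", "this", "that", "with", "what", "where", "there"],
        "es": ["que", "porque", "quien", "para", "pero", "donde", "cuando"],
    }
    detector = ToyDetector(corpora)
    for word in ["then", "quiero"]:
        print(word, detector.detect(word))
```

A keyboard could use the returned confidence to decide whether to switch the validation dictionary or keep the current language when the margin is small.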


