Language Resources and Technologies for Non-Scheduled and Endangered Indian Languages

by Ritesh Kumar, et al.

In this paper, we present a survey of the language resources and technologies available for the non-scheduled and endangered languages of India. While estimates of the number of languages in India vary across sources, it can be assumed that more than 1,000 languages are currently spoken in the country. However, barring some of the 22 languages included in the 8th Schedule of the Indian Constitution (called the scheduled languages), hardly any substantial resources or technologies are available for the remaining languages. Nonetheless, there have been some individual attempts at developing resources and technologies for different languages across the country. Of late, some financial support has also become available for the endangered languages. We summarise the resources and technologies for those Indian languages which are not included in the 8th Schedule of the Indian Constitution and/or which are endangered.



