KIND: an Italian Multi-Domain Dataset for Named Entity Recognition

12/30/2021
by Teresa Paccosi, et al.

In this paper we present KIND, an Italian dataset for Named Entity Recognition. It contains more than one million tokens annotated with three classes: persons, locations, and organizations. Most of the dataset (around 600K tokens) carries manual gold annotations and spans three domains: news, literature, and political discourses. Texts and annotations are freely available for download from the GitHub repository.
