Cem Mil Podcasts: A Spoken Portuguese Document Corpus

09/23/2022
by Edgar Tanaka, et al.

This document describes the Portuguese-language podcast dataset released by Spotify for academic research purposes. We give an overview of how the data was sampled, present basic statistics on the collection, and provide brief information on the distribution of Brazilian and Portuguese dialects.
