Few-Shot Drum Transcription in Polyphonic Music

08/06/2020
by Yu Wang, et al.

Data-driven approaches to automatic drum transcription (ADT) are often limited to a small, predefined vocabulary of percussion instrument classes. Such models can neither recognize out-of-vocabulary classes nor adapt to finer-grained vocabularies. In this work, we address open-vocabulary ADT by introducing few-shot learning to the task. We train a Prototypical Network on a synthetic dataset and evaluate the model on multiple real-world ADT datasets with polyphonic accompaniment. We show that, given just a handful of selected examples at inference time, our model matches, and in some cases outperforms, a state-of-the-art supervised ADT approach under a fixed-vocabulary setting. We also show that our model generalizes successfully to finer-grained or extended vocabularies unseen during training, a scenario in which supervised approaches cannot operate at all. We provide a detailed analysis of our experimental results, including a breakdown of performance by sound class and by polyphony.
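To make the few-shot idea concrete, the following is a minimal sketch of the generic Prototypical Network inference rule: the embeddings of each class's support examples are averaged into a prototype, and each query is assigned to the class of the nearest prototype. It is not the authors' implementation. The embeddings are assumed to come from some already-trained encoder, and the function names, the toy 2-D vectors, and the "kick"/"snare" labels are hypothetical, chosen only for illustration.

```python
import numpy as np

def compute_prototypes(support_emb, support_labels):
    """Average the support embeddings of each class into one prototype vector."""
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, protos

def classify(query_emb, classes, protos):
    """Label each query with the class of its nearest prototype (squared Euclidean distance)."""
    # dists has shape (n_queries, n_classes)
    dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]

# Toy, hand-made 2-D "embeddings" (assumption: a real system would obtain
# these by passing audio frames through a trained encoder network).
support_emb = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]])
support_labels = np.array(["kick", "kick", "snare", "snare"])
query_emb = np.array([[0.1, 0.1], [0.9, 1.0]])

classes, protos = compute_prototypes(support_emb, support_labels)
pred = classify(query_emb, classes, protos)
print(pred)
```

Because a class is defined only by the support examples supplied at inference time, adding a new or finer-grained percussion class in this sketch amounts to supplying a few labeled examples of it, with no retraining of the encoder.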


