Cosine similarity-based adversarial process

by   Hee-Soo Heo, et al.

An adversarial process between two deep neural networks is a promising approach to train a robust model. In this paper, we propose an adversarial process using cosine similarity, whereas conventional adversarial processes are based on inverted categorical cross entropy (CCE). When used for training an identification model, the adversarial process induces the competition of two discriminative models; one for a primary task such as speaker identification or image recognition, the other one for a subsidiary task such as channel identification or domain identification. In particular, the adversarial process degrades the performance of the subsidiary model by eliminating the subsidiary information in the input which, in assumption, may degrade the performance of the primary model. The conventional adversarial processes maximize the CCE of the subsidiary model to degrade the performance. We have studied a framework for training robust discriminative models by eliminating channel or domain information (subsidiary information) by applying such an adversarial process. However, we found through experiments that using the process of maximizing the CCE does not guarantee the performance degradation of the subsidiary model. In the proposed adversarial process using cosine similarity, on the contrary, the performance of the subsidiary model can be degraded more efficiently by searching feature space orthogonal to the subsidiary model. The experiments on speaker identification and image recognition show that we found features that make the outputs of the subsidiary models independent of the input, and the performances of the primary models are improved.


page 1

page 2

page 3

page 4


Deep Speaker: an End-to-End Neural Speaker Embedding System

We present Deep Speaker, a neural speaker embedding system that maps utt...

Pitch-synchronous DCT features: A pilot study on speaker identification

We propose a new feature, namely, pitchsynchronous discrete cosine trans...

Cosine-Distance Virtual Adversarial Training for Semi-Supervised Speaker-Discriminative Acoustic Embeddings

In this paper, we propose a semi-supervised learning (SSL) technique for...

FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances

Speaker identification models are vulnerable to carefully designed adver...

Speaker-Invariant Training via Adversarial Learning

We propose a novel adversarial multi-task learning scheme, aiming at act...

Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?

The emergence of large-margin softmax cross-entropy losses in training d...

Exploring the Sharpened Cosine Similarity

Convolutional layers have long served as the primary workhorse for image...

Please sign up or login with your details

Forgot password? Click here to reset