LSH Microbatches for Stochastic Gradients: Value in Rearrangement
Metric embeddings are an immensely useful representation of interacting entities such as videos, users, search queries, online resources, words, and more. Embeddings are computed by optimizing a loss function over a set of provided associations, so that the relations between embedding vectors reflect the strength of association. Moreover, the resulting embeddings allow us to predict the strength of unobserved associations. Typically, the optimization performs stochastic gradient updates on minibatches of associations that are arranged independently at random. We propose and study here the antithesis: coordinated arrangements, which we obtain efficiently through LSH microbatching, where similar associations are grouped together. Coordinated arrangements leverage the similarity of entities that is evident from their association vectors. We experimentally study the benefit of tunable minibatch arrangements, demonstrating consistent reductions of 3-15% in training. Arrangement thus emerges as a powerful performance knob for SGD that is orthogonal to, and compatible with, other tuning methods, and is therefore a candidate for wide deployment.
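To make the idea of LSH microbatching concrete, the following is a minimal sketch (not the paper's exact procedure): association vectors are bucketed with random-hyperplane (SimHash) LSH, so rows with identical signatures land in the same microbatch; the buckets are then shuffled and sliced into fixed-size minibatches for SGD. The function names, the choice of SimHash, and the parameters `n_bits` and `batch_size` are all illustrative assumptions.

```python
import numpy as np

def lsh_microbatches(assoc, n_bits=4, seed=0):
    """Group association vectors into LSH buckets (microbatches).

    Illustrative sketch: random-hyperplane (SimHash) LSH, where rows of
    the (n, d) array `assoc` whose sign signatures agree share a bucket.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((assoc.shape[1], n_bits))
    bits = (assoc @ planes) > 0            # sign pattern of projections
    keys = bits @ (1 << np.arange(n_bits)) # pack bits into an integer key
    buckets = {}
    for i, k in enumerate(keys):
        buckets.setdefault(int(k), []).append(i)
    return list(buckets.values())

def coordinated_minibatches(assoc, batch_size, n_bits=4, seed=0):
    """Concatenate LSH buckets and slice into fixed-size minibatches.

    Similar associations tend to share a minibatch; shuffling the bucket
    order (rather than individual items) retains stochasticity across
    epochs while preserving the coordinated arrangement.
    """
    rng = np.random.default_rng(seed)
    micro = lsh_microbatches(assoc, n_bits=n_bits, seed=seed)
    rng.shuffle(micro)
    order = [i for bucket in micro for i in bucket]
    return [order[j:j + batch_size] for j in range(0, len(order), batch_size)]
```

The returned index lists would feed a standard minibatch SGD loop in place of a uniformly random permutation; arrangement changes only which examples are grouped together, not the gradient computation itself.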