Long-tail Cross Modal Hashing

by Zijun Gao, et al.
Shandong University
George Mason University

Existing Cross Modal Hashing (CMH) methods are mainly designed for balanced data, whereas imbalanced data with a long-tail distribution is more common in the real world. Several long-tail hashing methods have been proposed, but they cannot adapt to multi-modal data because of the complex interplay between labels and the individuality and commonality information of multi-modal data. Furthermore, CMH methods mostly mine the commonality of multi-modal data to learn hash codes, which may override tail labels encoded by the individuality of the respective modalities. In this paper, we propose LtCMH (Long-tail CMH) to handle imbalanced multi-modal data. LtCMH first adopts auto-encoders to mine the individuality and commonality of different modalities by minimizing the dependency between the individuality of the respective modalities and by enhancing the commonality of these modalities. It then dynamically combines the individuality and commonality with direct features extracted from the respective modalities to create meta features that enrich the representation of tail labels, and binarizes the meta features to generate hash codes. LtCMH significantly outperforms state-of-the-art baselines on long-tail datasets and achieves better (or comparable) performance on datasets with balanced labels.
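
The last two steps of the pipeline (fusing features into meta features, then binarizing them into hash codes) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the combination is made dynamic, so the fixed-weight element-wise sum below, the sign-based binarization, and every function name (`combine_features`, `binarize`, `hamming_distance`, `rank_by_hamming`) are assumptions chosen only to make the idea concrete. The feature vectors are toy values, not outputs of any trained auto-encoder.

```python
from typing import List, Sequence


def combine_features(
    direct: Sequence[float],
    individuality: Sequence[float],
    commonality: Sequence[float],
    weights: Sequence[float] = (1.0, 1.0, 1.0),
) -> List[float]:
    """Fuse three equal-length feature vectors into one meta feature.

    Assumption: a fixed weighted element-wise sum stands in for the
    dynamic combination mentioned in the abstract.
    """
    if not (len(direct) == len(individuality) == len(commonality)):
        raise ValueError("feature vectors must have the same length")
    w_d, w_i, w_c = weights
    return [
        w_d * d + w_i * i + w_c * c
        for d, i, c in zip(direct, individuality, commonality)
    ]


def binarize(meta_feature: Sequence[float]) -> List[int]:
    """Map each dimension to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if value >= 0 else -1 for value in meta_feature]


def hamming_distance(code_a: Sequence[int], code_b: Sequence[int]) -> int:
    """Count the positions where two equal-length hash codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(
    query_code: Sequence[int], database_codes: Sequence[Sequence[int]]
) -> List[int]:
    """Return database indices sorted from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda idx: hamming_distance(query_code, database_codes[idx]),
    )


if __name__ == "__main__":
    # Toy 4-dimensional features for one query sample (hypothetical values).
    meta = combine_features(
        direct=[0.5, -1.0, 0.2, -0.3],
        individuality=[0.1, 0.4, -0.6, -0.2],
        commonality=[0.2, -0.1, 0.1, 0.9],
    )
    query = binarize(meta)
    # Toy database of hash codes from the other modality.
    database = [[1, -1, -1, 1], [-1, 1, 1, -1], [1, -1, 1, 1]]
    print(query, rank_by_hamming(query, database))
```

Once both modalities are mapped to codes of the same length, cross-modal retrieval reduces to ranking by Hamming distance, which is why hash codes keep storage and query cost low.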


