Recipe2Vec: Multi-modal Recipe Representation Learning with Graph Neural Networks

05/24/2022
by   Yijun Tian, et al.

Learning effective recipe representations is essential in food studies. However, prior work has focused on image-based recipe retrieval or on learning structural text embeddings; the combined effect of multi-modal information (i.e., recipe images, text, and relational data) has received less attention. In this paper, we formalize the problem of multi-modal recipe representation learning, which integrates visual, textual, and relational information into recipe embeddings. In particular, we first present Large-RG, a new recipe graph dataset with over half a million nodes, making it the largest recipe graph to date. We then propose Recipe2Vec, a novel graph neural network (GNN)-based recipe embedding model that captures multi-modal information. Additionally, we introduce an adversarial attack strategy to ensure stable learning and improve performance. Finally, we design a joint objective function combining node classification and adversarial learning to optimize the model. Extensive experiments demonstrate that Recipe2Vec outperforms state-of-the-art baselines on two classic food study tasks, i.e., cuisine category classification and region prediction. The dataset and code are available at https://github.com/meettyj/Recipe2Vec.
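The abstract only outlines the model at a high level, so the following is a minimal, hypothetical PyTorch sketch of that recipe: fuse image and text features into per-recipe node features, propagate them over the recipe graph with a single mean-aggregation GNN layer, and optimize a joint objective of node classification plus an FGSM-style adversarial term on the fused features. All module names, dimensions, the fusion scheme, and the exact attack form are assumptions made for illustration, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiModalRecipeEncoder(nn.Module):
    """Hypothetical fusion of visual, textual, and relational recipe signals."""

    def __init__(self, img_dim=512, txt_dim=300, hid_dim=128, n_classes=10):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hid_dim)  # project image features
        self.txt_proj = nn.Linear(txt_dim, hid_dim)  # project text features
        self.gnn = nn.Linear(hid_dim, hid_dim)       # one mean-aggregation GNN layer
        self.classifier = nn.Linear(hid_dim, n_classes)

    def fuse(self, img_feat, txt_feat):
        # Additive fusion of the two modalities into per-recipe node features.
        return F.relu(self.img_proj(img_feat) + self.txt_proj(txt_feat))

    def forward(self, node_feat, adj):
        # Mean aggregation over neighbors in the recipe graph (relational signal).
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = F.relu(self.gnn(adj @ node_feat / deg))
        return h, self.classifier(h)


def joint_loss(model, img, txt, adj, labels, eps=0.01):
    # Clean node-classification loss on the fused features.
    node_feat = model.fuse(img, txt)
    _, logits = model(node_feat, adj)
    ce = F.cross_entropy(logits, labels)

    # An FGSM-style perturbation of the fused features stands in for the paper's
    # adversarial attack strategy (the exact attack form is an assumption here).
    grad = torch.autograd.grad(ce, node_feat, retain_graph=True)[0]
    adv_feat = node_feat + eps * grad.sign()
    _, adv_logits = model(adv_feat, adj)
    adv_ce = F.cross_entropy(adv_logits, labels)

    # Joint objective: classification term plus adversarial term.
    return ce + adv_ce


# Dry run on random data: 6 recipe nodes with random images, text, and edges.
N = 6
model = MultiModalRecipeEncoder()
img = torch.randn(N, 512)
txt = torch.randn(N, 300)
adj = (torch.rand(N, N) > 0.5).float()
labels = torch.randint(0, 10, (N,))
joint_loss(model, img, txt, adj, labels).backward()
```

The sketch trains on clean and perturbed features jointly, which is one common way to realize "node classification plus adversarial learning" in a single objective; the paper's actual attack and fusion details may differ.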


Related research

12/26/2016 · Image-Text Multi-Modal Representation Learning by Adversarial Backpropagation
We present a novel method for image-text multi-modal representation learni...

08/20/2020 · Multi-modal Cooking Workflow Construction for Food Recipes
Understanding a food recipe requires anticipating the implicit causal effe...

10/04/2020 · Multi-Modal Retrieval using Graph Neural Networks
Most real-world applications of image retrieval such as Adobe Stock, whi...

09/10/2023 · Multi-modal Extreme Classification
This paper develops the MUFIN technique for extreme classification (XC) ...

08/23/2022 · Multi-Modal Representation Learning with Self-Adaptive Thresholds for Commodity Verification
In this paper, we propose a method to identify identical commodities. In...

09/15/2023 · Unified Brain MR-Ultrasound Synthesis using Multi-Modal Hierarchical Representations
We introduce MHVAE, a deep hierarchical variational auto-encoder (VAE) t...

03/31/2022 · A Rich Recipe Representation as Plan to Support Expressive Multi Modal Queries on Recipe Content and Preparation Process
Food is not only a basic human necessity but also a key factor driving a...
