A Study on the Evaluation of Generative Models

06/22/2022
by Eyal Betzalel, et al.

Implicit generative models, which do not return likelihood values, such as generative adversarial networks and diffusion models, have become prevalent in recent years. Although these models have shown remarkable results, evaluating their performance is challenging. Reliable evaluation is vital for pushing research forward and for distinguishing meaningful gains from random noise. Currently, heuristic metrics such as the Inception Score (IS) and Fréchet Inception Distance (FID) are the most common evaluation metrics, but what they measure is not entirely clear. Additionally, there are questions regarding how meaningful their scores actually are. In this work, we study the evaluation metrics of generative models by generating a high-quality synthetic dataset on which we can estimate classical metrics for comparison. Our study shows that while FID and IS do correlate with several f-divergences, their ranking of close models can vary considerably, making them problematic when used for fine-grained comparison. We further use this experimental setting to study which evaluation metric best correlates with our probabilistic metrics. Lastly, we look into the base features used for metrics such as FID.
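For readers unfamiliar with the Inception Score mentioned above, the following is a minimal sketch of its commonly used definition, IS = exp(E_x[KL(p(y|x) || p(y))]). It is not the authors' evaluation code. The function name `inception_score` is hypothetical, and the input is assumed to be a plain list of per-sample class-probability vectors; in practice those probabilities would come from a pretrained Inception classifier applied to generated images.

```python
import math


def inception_score(probs):
    """Illustrative Inception Score from class probabilities.

    probs: list of per-sample probability vectors p(y|x), each summing to 1.
    Assumption: these stand in for the outputs of a pretrained classifier.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Mean KL(p(y|x) || p(y)); entries with probability 0 contribute 0.
    mean_kl = 0.0
    for p in probs:
        mean_kl += sum(
            pj * math.log(pj / marginal[j]) for j, pj in enumerate(p) if pj > 0
        )
    mean_kl /= n
    return math.exp(mean_kl)
```

Under this definition, confident predictions spread evenly over two classes, such as `[[1.0, 0.0], [0.0, 1.0]]`, give a score of about 2, while identical uniform predictions give a score of 1. The score therefore rewards confident and diverse class predictions, which is one reason its relation to distributional fidelity is not obvious.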

research
01/06/2018

A Note on the Inception Score

Deep generative models are powerful tools that have produced impressive ...
research
08/10/2022

Evaluating Generatively Synthesized Diabetic Retinopathy Imagery

Publicly available data for the training of diabetic retinopathy classif...
research
09/04/2018

Handwriting styles: benchmarks and evaluation metrics

Evaluating the style of handwriting generation is a challenging problem,...
research
11/16/2019

Effectively Unbiased FID and Inception Score and where to find them

This paper shows that two commonly used evaluation metrics for generativ...
research
08/31/2022

Evaluating generative audio systems and their metrics

Recent years have seen considerable advances in audio synthesis with dee...
research
04/26/2020

Evaluation Metrics for Conditional Image Generation

We present two new metrics for evaluating generative models in the class...
research
06/29/2022

Can Push-forward Generative Models Fit Multimodal Distributions?

Many generative models synthesize data by transforming a standard Gaussi...
