Rarity Score: A New Metric to Evaluate the Uncommonness of Synthesized Images

by Jiyeon Han, et al.

Evaluation metrics in image synthesis play a key role in measuring the performance of generative models. However, most metrics focus mainly on image fidelity. Existing diversity metrics are derived by comparing distributions, and thus they cannot quantify the diversity or rarity of each individual generated image. In this work, we propose a new evaluation metric, called "rarity score", to measure the individual rarity of each image synthesized by generative models. We first show empirically that, in terms of nearest-neighbor distances in feature space, common samples are close to each other and rare samples are far from each other. We then use our metric to demonstrate that the extent to which different generative models produce rare images can be effectively compared. We also propose a method to compare rarities between datasets that share the same concept, such as CelebA-HQ and FFHQ. Finally, we analyze the use of the metric in different designs of feature spaces to better understand the relationship between feature spaces and the resulting sparse images. Code will be publicly available online for the research community.
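The abstract does not state the formula, so the following is only a minimal sketch of how a per-sample rarity measure could be built from nearest-neighbor distances in feature space. It assumes features have already been extracted as NumPy arrays, and it assumes one plausible rule: a generated sample is scored by the smallest k-NN radius among the real samples whose k-NN ball contains it, so samples that land in sparse regions receive larger scores. The function names `knn_radii` and `rarity_sketch`, the parameter `k`, and the scoring rule itself are illustrative assumptions, not the authors' released code.

```python
import numpy as np


def knn_radii(real, k):
    """Distance from each real feature to its k-th nearest real neighbour."""
    # Pairwise Euclidean distances, shape (n_real, n_real).
    d = np.linalg.norm(real[:, None, :] - real[None, :, :], axis=-1)
    # After sorting each row, column 0 is the self-distance (0),
    # so column k is the k-th nearest neighbour.
    return np.sort(d, axis=1)[:, k]


def rarity_sketch(real, fake, k=3):
    """Assumed scoring rule: smallest k-NN radius among the real samples
    whose k-NN ball contains the generated sample; NaN if no ball does."""
    radii = knn_radii(real, k)
    # Distances from every generated feature to every real feature.
    d = np.linalg.norm(fake[:, None, :] - real[None, :, :], axis=-1)
    inside = d <= radii[None, :]
    # Keep radii only for balls that contain the sample, then take the minimum.
    candidates = np.where(inside, radii[None, :], np.inf)
    score = candidates.min(axis=1)
    # Samples outside every ball are off the real manifold: leave unscored.
    score[~inside.any(axis=1)] = np.nan
    return score


if __name__ == "__main__":
    # Toy 1-D "features": a dense cluster near 0 and a sparse tail.
    real = np.array([[0.0], [0.1], [0.2], [0.3], [5.0], [7.0], [9.0]])
    fake = np.array([[0.15], [6.0], [100.0]])
    print(rarity_sketch(real, fake, k=2))
```

In this toy setup, a generated point inside the dense cluster gets a small score, a point in the sparse tail gets a larger one, and a point outside every ball is left unscored. The brute-force pairwise distance matrix is chosen for readability; for realistic dataset sizes an approximate nearest-neighbor index would be the more practical choice.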




Unsupervised evaluation of GAN sample quality: Introducing the TTJac Score

Evaluation metrics are essential for assessing the performance of genera...

Random Network Distillation as a Diversity Metric for Both Image and Text Generation

Generative models are increasingly able to produce remarkably high quali...

Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models

We systematically study a wide variety of image-based generative models ...

F?D: On understanding the role of deep feature spaces on face generation evaluation

Perceptual metrics, like the Fréchet Inception Distance (FID), are widel...

On the Evaluation of Generative Models in High Energy Physics

There has been a recent explosion in research into machine-learning-base...

Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions

Precision and Recall are two prominent metrics of generative performance...

Barcode Method for Generative Model Evaluation driven by Topological Data Analysis

Evaluating the performance of generative models in image synthesis is a ...
