Learning Image Aesthetic Assessment from Object-level Visual Components

04/04/2021
by Jingwen Hou, et al.

As Van Gogh said, great things are done by a series of small things brought together. Aesthetic experience arises from the aggregation of underlying visual components. However, most existing deep image aesthetic assessment (IAA) methods over-simplify the IAA process by failing to model image aesthetics with clearly-defined visual components as building blocks. As a result, the connection between the resulting aesthetic predictions and the underlying visual components is largely invisible and hard to control explicitly, which limits both the performance and the interpretability of the model. This work aims to model image aesthetics at the level of visual components. Specifically, object-level regions detected by a generic object detector are defined as visual components, namely object-level visual components (OVCs). Generic features representing the OVCs are then aggregated for aesthetic prediction using the proposed object-level and graph attention mechanisms, which dynamically determine the importance of individual OVCs and the relevance between OVC pairs, respectively. Experimental results confirm the superiority of our framework over previous relevant methods in terms of SRCC and PLCC on aesthetic rating distribution prediction. In addition, we conduct a quantitative interpretation analysis by observing how OVCs contribute to aesthetic predictions; the findings are consistent with the psychology of aesthetics and with established photography rules. To the best of our knowledge, this is the first attempt to interpret a deep IAA model.

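The attention-based aggregation described in the abstract can be illustrated with a minimal PyTorch sketch. It assumes each image is represented by a set of OVC features extracted from detector regions; the module name OVCAggregator, the layer sizes, the single-head graph attention, and the ten-bin rating distribution head are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OVCAggregator(nn.Module):
    """Sketch of attention-based aggregation over object-level visual
    component (OVC) features. Layer sizes and the single-head graph
    attention are assumptions for illustration only."""

    def __init__(self, feat_dim=1024, hidden_dim=256, num_bins=10):
        super().__init__()
        # Object-level attention: scores the importance of each OVC.
        self.obj_attn = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )
        # Graph attention: pairwise relevance between OVCs (single head).
        self.query = nn.Linear(feat_dim, hidden_dim)
        self.key = nn.Linear(feat_dim, hidden_dim)
        self.value = nn.Linear(feat_dim, hidden_dim)
        # Predict an aesthetic rating distribution over num_bins score levels.
        self.head = nn.Linear(hidden_dim, num_bins)

    def forward(self, ovc_feats):
        # ovc_feats: (batch, num_ovcs, feat_dim) features of detected regions.
        q, k, v = self.query(ovc_feats), self.key(ovc_feats), self.value(ovc_feats)
        # Pairwise relevance between OVCs (num_ovcs x num_ovcs per image).
        rel = torch.softmax(q @ k.transpose(1, 2) / k.shape[-1] ** 0.5, dim=-1)
        ctx = rel @ v                                        # relation-aware OVC features
        w = torch.softmax(self.obj_attn(ovc_feats), dim=1)   # per-OVC importance weights
        pooled = (w * ctx).sum(dim=1)                        # weighted aggregation over OVCs
        return F.softmax(self.head(pooled), dim=-1)          # rating distribution

# Example: 2 images, 16 detected OVCs each, 1024-dim region features.
model = OVCAggregator()
dist = model(torch.randn(2, 16, 1024))   # -> (2, 10) rating distributions
```

Because the importance weights w are computed per OVC, inspecting them after inference gives the kind of component-level attribution that the quantitative interpretation analysis relies on.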