OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images

by Weijia Li, et al.

This paper presents OmniCity, a new dataset for omnipotent city understanding from multi-level and multi-view images. More precisely, OmniCity contains multi-view satellite images as well as street-level panorama and mono-view images, comprising over 100K pixel-wise annotated images that are well-aligned and collected from 25K geo-locations in New York City. To reduce the substantial pixel-wise annotation effort, we propose an efficient street-view image annotation pipeline that leverages the existing label maps of the satellite view and the transformation relations between different views (satellite, panorama, and mono-view). With the new OmniCity dataset, we provide benchmarks for a variety of tasks including building footprint extraction, height estimation, and building plane/instance/fine-grained segmentation. Compared with existing multi-level and multi-view benchmarks, OmniCity contains a larger number of images with richer annotation types and more views, provides more benchmark results of state-of-the-art models, and introduces a novel task for fine-grained building instance segmentation on street-level panorama images. Moreover, OmniCity provides new problem settings for existing tasks, such as cross-view image matching, synthesis, segmentation, and detection, and facilitates the development of new methods for large-scale city understanding, reconstruction, and simulation. The OmniCity dataset as well as the benchmarks will be available at




UrbanBIS: a Large-scale Benchmark for Fine-grained Urban Building Instance Segmentation

We present the UrbanBIS benchmark for large-scale 3D urban understanding...

Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization

The goal of cross-view image based geo-localization is to determine the ...

Material Segmentation of Multi-View Satellite Imagery

Material recognition methods use image context and local cues for pixel-...

Holistic Multi-View Building Analysis in the Wild with Projection Pooling

We address six different classification tasks related to fine-grained bu...

Semi-supervised Learning from Street-View Images and OpenStreetMap for Automatic Building Height Estimation

Accurate building height estimation is key to the automatic derivation o...

Part-level Car Parsing and Reconstruction from Single Street View

In this paper, we make the first attempt to build a framework to simulta...

Recurrent Aggregation Learning for Multi-View Echocardiographic Sequences Segmentation

Multi-view echocardiographic sequences segmentation is crucial for clini...
