UPDExplainer: an Interpretable Transformer-based Framework for Urban Physical Disorder Detection Using Street View Imagery

by Chuanbo Hu, et al.

Urban Physical Disorder (UPD), such as old or abandoned buildings, broken sidewalks, litter, and graffiti, has a negative impact on residents' quality of life. It can also increase crime rates, cause social disorder, and pose a public health risk. Currently, there is a lack of efficient and reliable methods for detecting and understanding UPD. To bridge this gap, we propose UPDExplainer, an interpretable transformer-based framework for UPD detection. We first develop a UPD detection model based on the Swin Transformer architecture, which leverages readily accessible street view images to learn discriminative representations. To provide clear and comprehensible evidence and analysis, we then introduce a UPD factor identification and ranking module that combines visual explanation maps with semantic segmentation maps. This integrated approach enables us to identify the exact objects within street view images that are responsible for physical disorder and to gain insights into the underlying causes. Experimental results on the re-annotated Place Pulse 2.0 dataset demonstrate promising detection performance of the proposed method, with an accuracy of 79.9%. For a comprehensive evaluation of the method's ranking performance, we report the mean Average Precision (mAP), R-Precision (RPrec), and Normalized Discounted Cumulative Gain (NDCG), with success rates of 75.51% [...], respectively. We also present a case study of detecting and ranking physical disorders in the southern region of downtown Los Angeles, California, to demonstrate the practicality and effectiveness of our framework.
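The factor identification and ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the two maps are combined, so the scoring rule here (summing explanation weight per semantic class and normalizing) is an assumption, and the function name `rank_factors`, the toy maps, and the class labels are all hypothetical.

```python
from collections import defaultdict


def rank_factors(explanation_map, segmentation_map, class_names):
    """Rank semantic classes by their share of the total explanation weight.

    explanation_map: 2D list of non-negative floats (e.g. a saliency heatmap).
    segmentation_map: 2D list of integer class ids with the same shape.
    class_names: dict mapping class id -> readable label.

    Assumption: a class's importance is the sum of the heatmap values over
    its pixels, divided by the sum over the whole image.
    """
    totals = defaultdict(float)
    for heat_row, seg_row in zip(explanation_map, segmentation_map):
        for heat, class_id in zip(heat_row, seg_row):
            totals[class_id] += heat

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []  # the explanation map carries no evidence at all

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(class_names[cid], weight / grand_total) for cid, weight in ranked]


# Toy 2x3 image with hypothetical classes and heatmap values.
heat = [[0.5, 0.25, 0.0],
        [0.25, 0.0, 0.0]]
seg = [[1, 1, 0],
       [2, 0, 0]]
names = {0: "sky", 1: "building", 2: "sidewalk"}

ranking = rank_factors(heat, seg, names)
for label, share in ranking:
    print(f"{label}: {share:.2f}")
```

In this toy case the building pixels hold most of the explanation weight, so "building" ranks first. A real pipeline would take the heatmap from the detection model's visual explanation method and the class ids from a semantic segmentation network.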




