A deep mixture density network for outlier-corrected interpolation of crowd-sourced weather data

by Charlie Kirkwood, et al.

As the costs of sensors and associated IT infrastructure decrease, as exemplified by the Internet of Things, increasing volumes of observational data are becoming available to environmental scientists. However, as the number of available observation sites increases, so too does the opportunity for data quality issues to emerge, particularly given that many of these sensors do not have the benefit of official maintenance teams. To realise the value of crowd-sourced 'Internet of Things' type observations for environmental modelling, we require approaches that can automate the detection of outliers during the data modelling process, so that outliers do not contaminate the estimated distribution of the phenomena of interest. To this end, we present a Bayesian deep learning approach for spatio-temporal modelling of environmental variables with automatic outlier detection. Our approach implements a Gaussian-uniform mixture density network with two purposes: modelling the phenomenon of interest, and learning to classify and ignore outliers. Both are achieved simultaneously, each by a specifically designed branch of our neural network. For our example application, we use the Met Office's Weather Observation Website data, an archive of observations from around 1900 privately run and unofficial weather stations across the British Isles. Using data on surface air temperature, we demonstrate how our deep mixture model approach enables the modelling of a highly skilful spatio-temporal temperature distribution without contamination from spurious observations. We hope that adoption of our approach will help unlock the potential of incorporating a wider range of observation sources, including crowd-sourced ones, into future environmental models.
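To make the Gaussian-uniform mixture idea concrete, the following is a minimal sketch of how such a likelihood could be evaluated for a batch of observations. It is not the authors' implementation: the abstract does not give the parametrisation, so the function name `gaussian_uniform_nll`, the fixed uniform bounds `lo` and `hi`, and the use of a single scalar `outlier_prob` are all assumptions made for illustration. In a mixture density network, `mu`, `sigma` and `outlier_prob` would presumably be outputs of network branches rather than constants.

```python
import numpy as np


def gaussian_uniform_nll(y, mu, sigma, outlier_prob, lo, hi):
    """Per-observation negative log-likelihood under an assumed mixture
    (1 - outlier_prob) * Normal(mu, sigma) + outlier_prob * Uniform(lo, hi),
    plus the posterior probability that each observation is an outlier.

    All argument names are hypothetical; they are not taken from the paper.
    """
    y = np.asarray(y, dtype=float)

    # Log-density of the Gaussian ("real signal") component.
    log_gauss = (
        -0.5 * np.log(2.0 * np.pi)
        - np.log(sigma)
        - 0.5 * ((y - mu) / sigma) ** 2
    )

    # Log-density of the uniform ("outlier") component; -inf outside [lo, hi].
    inside = (y >= lo) & (y <= hi)
    with np.errstate(divide="ignore"):
        log_unif = np.where(inside, -np.log(hi - lo), -np.inf)
        log_inlier = np.log1p(-outlier_prob) + log_gauss
        log_outlier = np.log(outlier_prob) + log_unif

    # Combine the two weighted components in log space for numerical stability.
    log_mix = np.logaddexp(log_inlier, log_outlier)

    # Posterior responsibility of the outlier component for each observation.
    outlier_resp = np.exp(log_outlier - log_mix)
    return -log_mix, outlier_resp


# Hypothetical temperatures in degrees Celsius: one plausible, one spurious.
nll, resp = gaussian_uniform_nll(
    y=[10.0, 45.0], mu=10.0, sigma=1.0, outlier_prob=0.05, lo=-50.0, hi=50.0
)
```

Under these assumed values, the reading that matches the predicted mean receives a very small outlier responsibility, while the reading far from it is attributed almost entirely to the uniform component, so it contributes little to the fit of the Gaussian branch. This is the sense in which such a model can "classify and ignore" outliers while being trained.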




