Bias Reduction via End-to-End Shift Learning: Application to Citizen Science

11/01/2018
by Di Chen, et al.

Citizen science projects are successful at gathering rich datasets for various applications. Nevertheless, the data collected by citizen scientists are often biased, aligned more with the citizens' preferences than with scientific objectives. We propose the Shift Compensation Network (SCN), an end-to-end learning scheme that learns the shift from the scientific objectives to the biased data while compensating for the shift by re-weighting the training data. Applied to bird observational data from the citizen science project eBird, we demonstrate that SCN both quantifies the data distribution shift and outperforms supervised learning models that do not address the data bias. Compared with competing models in the context of covariate shift, we further demonstrate the advantage of SCN in both its effectiveness and its capability to handle massive high-dimensional data.
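
The re-weighting idea mentioned in the abstract can be illustrated with a classical discriminator-based density-ratio estimate. The following is a minimal sketch under stated assumptions, not the SCN itself: it swaps the end-to-end neural networks for a plain logistic regression, uses synthetic one-dimensional Gaussian data instead of eBird observations, and all function and variable names (`fit_logistic`, `importance_weights`, `X_biased`, `X_target`) are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Gradient-descent logistic regression; returns weights with bias last."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def importance_weights(X_biased, X_target):
    """Estimate w(x) ~ p_target(x) / p_biased(x) on the biased samples.

    A discriminator is trained to tell biased (label 0) from target
    (label 1) samples; its odds, corrected for the sample-size ratio,
    estimate the density ratio.
    """
    X = np.vstack([X_biased, X_target])
    d = np.concatenate([np.zeros(len(X_biased)), np.ones(len(X_target))])
    w = fit_logistic(X, d)
    Xb = np.hstack([X_biased, np.ones((len(X_biased), 1))])
    p = 1.0 / (1.0 + np.exp(-Xb @ w))  # estimated P(target | x)
    ratio = p / (1.0 - p) * (len(X_biased) / len(X_target))
    return ratio / ratio.mean()  # normalize so the weights average to 1

# Synthetic covariate shift (an assumption for illustration only):
# "biased" data centred at 0, "target" data centred at 1.
rng = np.random.default_rng(0)
X_biased = rng.normal(0.0, 1.0, size=(2000, 1))
X_target = rng.normal(1.0, 1.0, size=(2000, 1))

weights = importance_weights(X_biased, X_target)
unweighted_mean = X_biased[:, 0].mean()
weighted_mean = np.average(X_biased[:, 0], weights=weights)
print(f"unweighted mean of biased data: {unweighted_mean:.3f}")
print(f"re-weighted mean of biased data: {weighted_mean:.3f}")
```

In this toy setting the re-weighted mean of the biased samples moves toward the target mean, which is the effect that re-weighting the training loss is meant to achieve. In a supervised model the same weights would multiply each training example's loss term.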


