Multi-characteristic Subject Selection from Biased Datasets

12/18/2020
by Tahereh Arabghalizi, et al.

Subject selection plays a critical role in experimental studies, especially those with human subjects. Anecdotal evidence suggests that many such studies, conducted on or near university campuses, suffer from selection bias, i.e., the too-many-college-kids-as-subjects problem. Unfortunately, traditional sampling techniques, when applied to biased data, typically return biased results. In this paper, we tackle the problem of multi-characteristic subject selection from biased datasets. We present a constrained optimization-based method that finds the best possible sampling fractions for the different population subgroups, based on the desired sampling fractions provided by the researcher running the subject selection. We perform an extensive experimental study using a variety of real datasets. Our results show that our proposed method outperforms the baselines for all problem variations by up to 90
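
To make the idea of "best possible sampling fractions" concrete, here is a minimal sketch in Python. It is not the authors' formulation: the abstract does not state the objective or the constraints, so the sketch assumes a least-squares distance between desired and achieved fractions, and it assumes the only constraint is that a subgroup cannot supply more subjects than the biased pool contains. The function name `best_fractions` and its parameters are hypothetical. Each subgroup would correspond to one combination of characteristics (for example, age band and occupation).

```python
def best_fractions(desired, available, n):
    """Closest achievable sampling fractions (least squares) to `desired`.

    desired:   target fraction per subgroup, summing to 1
    available: number of candidates per subgroup in the biased pool
    n:         total number of subjects to select
    Assumption: least-squares objective; not taken from the paper.
    """
    if n <= 0 or n > sum(available):
        raise ValueError("n must be positive and at most the pool size")
    # Largest fraction of the sample that each subgroup can supply.
    caps = [a / n for a in available]

    def clipped(shift):
        # Shift every target equally, then clip to what the pool allows.
        return [min(max(d + shift, 0.0), c) for d, c in zip(desired, caps)]

    # Bisection on the shift: at lo every fraction clips to 0,
    # at hi every fraction clips to its cap (total >= 1).
    lo, hi = -1.0, max(caps)
    for _ in range(100):
        mid = (lo + hi) / 2
        if sum(clipped(mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return clipped(hi)


# A pool of 90 students and 10 non-students, a 50/50 target, 40 subjects.
fractions = best_fractions([0.5, 0.5], [90, 10], 40)
print([round(f, 3) for f in fractions])
```

In this example the pool holds only 10 non-students, so at most a quarter of the 40 subjects can come from that subgroup. The sketch therefore settles on fractions of roughly 0.75 and 0.25, the closest achievable split to the requested 50/50 under the assumed objective.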

