ZaliQL: A SQL-Based Framework for Drawing Causal Inference from Big Data

by Babak Salimi, et al.

Causal inference from observational data is a subject of active research and development in statistics and computer science. Many toolkits that depend on statistical software have been developed for this purpose. However, these toolkits do not scale to large datasets. In this paper we describe a suite of techniques for expressing causal inference tasks from observational data in SQL. This suite supports state-of-the-art methods for causal inference and runs at scale within a database engine. In addition, we introduce several optimization techniques that significantly speed up causal inference, in both the online and offline settings. We evaluate the quality and performance of our techniques through experiments on real datasets.
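To make the idea of expressing a causal inference task in SQL concrete, here is a minimal sketch, not the paper's implementation. It assumes a hypothetical table `units` with a binary treatment `t`, a single discrete covariate `x`, and an outcome `y`, and uses exact matching on `x` as one illustrative method. The table, column names, and toy rows are all assumptions made for illustration; Python's built-in `sqlite3` module stands in for the database engine.

```python
import sqlite3

# Hypothetical toy data: (treatment, covariate, outcome). Not from the paper.
rows = [
    (1, "a", 10.0), (0, "a", 8.0), (0, "a", 6.0),
    (1, "b", 5.0), (1, "b", 7.0), (0, "b", 4.0),
    (1, "c", 9.0),  # stratum "c" has no control unit, so it is dropped
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE units (t INTEGER, x TEXT, y REAL)")
conn.executemany("INSERT INTO units VALUES (?, ?, ?)", rows)

# Exact matching: group units by covariate value, keep only strata that
# contain both treated and control units, then average the per-stratum
# difference in mean outcomes, weighted by stratum size.
ATE_QUERY = """
WITH strata AS (
    SELECT x,
           COUNT(*) AS n,
           AVG(CASE WHEN t = 1 THEN y END) AS avg_treated,
           AVG(CASE WHEN t = 0 THEN y END) AS avg_control
    FROM units
    GROUP BY x
    HAVING SUM(t) > 0 AND SUM(1 - t) > 0
)
SELECT SUM(n * (avg_treated - avg_control)) / SUM(n) AS ate
FROM strata
"""

ate = conn.execute(ATE_QUERY).fetchone()[0]
print(ate)
```

Because the matching and the estimate are both plain aggregation queries, the whole computation stays inside the database engine rather than being exported to separate statistical software.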




