Tuple-Independent Representations of Infinite Probabilistic Databases

08/21/2020
by   Nofar Carmeli, et al.

Probabilistic databases (PDBs) are probability spaces over database instances. They provide a framework for handling uncertainty in databases, such as that arising from data integration, noisy data, data from unreliable sources, or randomized processes. Most of the existing theoretical literature has investigated finite, tuple-independent PDBs (TI-PDBs), where the occurrences of tuples are independent events. Only recently, Grohe and Lindner (PODS '19) introduced independence assumptions for PDBs beyond the finite domain assumption. In the finite setting, a major argument for discussing the theoretical properties of TI-PDBs is that they can be used to represent any finite PDB via views. This is no longer the case once the number of tuples is countably infinite. In this paper, we systematically study the representability of infinite PDBs in terms of TI-PDBs and the related block-independent disjoint PDBs. The central question is which infinite PDBs are representable as first-order views over tuple-independent PDBs. We give a necessary condition for the representability of PDBs and provide a sufficient criterion for representability in terms of the probability distribution of a PDB. With various examples, we explore the limits of our criteria. We show that conditioning on first-order properties yields no additional power in terms of expressivity. Finally, we discuss the relation between purely logical and arithmetic reasons for (non-)representability.
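
To make the notion of tuple independence concrete, the following is a minimal sketch of a finite TI-PDB, not code from the paper. It assumes a toy encoding in which a fact is a tuple such as `("R", "a")` and the PDB is given by a dictionary of marginal probabilities; the names `marginals`, `instance_probability`, and `all_instances` are hypothetical. Because the facts occur independently, the probability of an instance is the product of `p` over the facts it contains and `1 - p` over the facts it omits.

```python
from itertools import combinations


def instance_probability(marginals, instance):
    """Probability of one database instance in a finite TI-PDB.

    marginals: dict mapping each possible fact to its marginal probability.
    instance:  set of facts that are present in the instance.
    """
    prob = 1.0
    for fact, p in marginals.items():
        # Independent events: multiply p for present facts, 1 - p for absent ones.
        prob *= p if fact in instance else 1.0 - p
    return prob


def all_instances(marginals):
    """Enumerate every subset of the possible facts as a frozenset."""
    facts = list(marginals)
    for size in range(len(facts) + 1):
        for subset in combinations(facts, size):
            yield frozenset(subset)


# Hypothetical toy PDB with two possible facts of a unary relation R.
marginals = {("R", "a"): 0.5, ("R", "b"): 0.25}

# The induced probability space over all four possible instances.
distribution = {
    inst: instance_probability(marginals, inst)
    for inst in all_instances(marginals)
}
```

In this finite case the instance probabilities sum to one, and summing over the instances that contain a given fact recovers that fact's marginal. The enumeration is exponential in the number of facts and only serves as an illustration; it does not carry over to the countably infinite setting studied in the paper.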
