α-Information-theoretic Privacy Watchdog and Optimal Privatization Scheme
This paper proposes an α-lift measure for data privacy and determines the optimal privatization scheme that minimizes α-lift in the watchdog method. To release data X that is correlated with sensitive information S, the ratio l(s,x) = p(s|x)/p(s) denotes the `lift' of the posterior belief on S and quantifies data privacy. The α-lift is proposed as the L_α-norm of the lift: ℓ_α(x) = ‖l(·,x)‖_α = (E[l(S,x)^α])^{1/α}. This is a tunable measure: when α < ∞, each lift is weighted by its likelihood of appearing in the dataset (w.r.t. the marginal probability p(s)); for α = ∞, the α-lift reduces to the existing maximum lift. To generate the sanitized data Y, we adopt the privacy watchdog method using α-lift: obtain the set 𝒳_ϵ containing all x such that ℓ_α(x) > e^ϵ, then apply a randomization r(y|x) to all x ∈ 𝒳_ϵ, while all other x ∈ 𝒳∖𝒳_ϵ are published directly. For the resulting α-lift ℓ_α(y), it is shown that the Sibson mutual information I_α^S(S;Y) is proportional to E[ℓ_α(Y)]. We further define a stronger measure I̅_α^S(S;Y) using the worst-case α-lift, max_y ℓ_α(y). We prove that the optimal randomization r^*(y|x) minimizing both I_α^S(S;Y) and I̅_α^S(S;Y) is X-invariant, i.e., r^*(y|x) = R(y), ∀x ∈ 𝒳_ϵ, for any probability distribution R over y ∈ 𝒳_ϵ. Numerical experiments show that α-lift provides flexibility in the privacy-utility tradeoff.
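The α-lift computation and the watchdog step described above can be sketched numerically. The following is a minimal illustration, not the paper's implementation: the 3×3 joint distribution, the parameter values (α = 2, ϵ = 0.11), and the uniform choice of R over 𝒳_ϵ are all illustrative assumptions.

```python
import numpy as np

# Hypothetical toy joint distribution p(s, x): rows index the sensitive value s,
# columns index the data value x (the numbers are illustrative only).
P_SX = np.array([
    [0.15, 0.05, 0.10],
    [0.05, 0.25, 0.05],
    [0.10, 0.05, 0.20],
])
p_s = P_SX.sum(axis=1)               # marginal p(s)
p_x = P_SX.sum(axis=0)               # marginal p(x)
lift = (P_SX / p_x) / p_s[:, None]   # l(s, x) = p(s|x) / p(s)

def alpha_lift(alpha):
    """alpha-lift per x: (E[l(S,x)^alpha])^(1/alpha); max over s when alpha = inf."""
    if np.isinf(alpha):
        return lift.max(axis=0)
    return (p_s[:, None] * lift**alpha).sum(axis=0) ** (1.0 / alpha)

# Watchdog step: flag every x whose alpha-lift exceeds e^eps.
alpha, eps = 2.0, 0.11
X_eps = alpha_lift(alpha) > np.exp(eps)

# X-invariant randomization: every flagged x is replaced by a y drawn from one
# fixed distribution R over X_eps (uniform here); other x are published as-is.
R = X_eps / max(X_eps.sum(), 1)
```

Because r^*(y|x) does not depend on x inside 𝒳_ϵ, a single vector R suffices to describe the whole randomization; with a smaller ϵ, more columns fall into 𝒳_ϵ and R spreads over a larger support.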