Learners that Leak Little Information
We study learning algorithms that are restricted to revealing little information about their input sample. Various manifestations of this notion have been recently studied. A central theme in these works, and in ours, is that such algorithms generalize. We study a category of learning algorithms, which we term d-bit information learners. These are algorithms whose output conveys at most d bits of information on their input. We focus on the learning capacity of such algorithms: we prove generalization bounds with tight dependencies on the confidence and error parameters. We observe connections with well studied notions such as PAC-Bayes and differential privacy. For example, it is known that pure differentially private algorithms leak little information. We complement this fact with a separation between bounded information and pure differential privacy in the setting of proper learning, showing that differential privacy is strictly more restrictive. We also demonstrate limitations by exhibiting simple concept classes for which every (possibly randomized) empirical risk minimizer must leak a lot of information. On the other hand, we show that in the distribution-dependent setting every VC class has empirical risk minimizers that do not leak a lot of information.
READ FULL TEXT