Probabilistic supervised learning

01/02/2018
by Frithjof Gressmann, et al.

Predictive modelling and supervised learning are central to modern data science. With predictions from an ever-expanding number of supervised black-box strategies - e.g., kernel methods, random forests, deep learning aka neural networks - being employed as a basis for decision making processes, it is crucial to understand the statistical uncertainty associated with these predictions. As a general means to approach the issue, we present an overarching framework for black-box prediction strategies that not only predict the target but also their own predictions' uncertainty. Moreover, the framework allows for fair assessment and comparison of disparate prediction strategies. For this, we formally consider strategies capable of predicting full distributions from feature variables, so-called probabilistic supervised learning strategies. Our work draws from prior work including Bayesian statistics, information theory, and modern supervised machine learning, and in a novel synthesis leads to (a) new theoretical insights such as a probabilistic bias-variance decomposition and an entropic formulation of prediction, as well as to (b) new algorithms and meta-algorithms, such as composite prediction strategies, probabilistic boosting and bagging, and a probabilistic predictive independence test. Our black-box formulation also leads (c) to a new modular interface view on probabilistic supervised learning and a modelling workflow API design, which we have implemented in the newly released skpro machine learning toolbox, extending the familiar modelling interface and meta-modelling functionality of sklearn. The skpro package provides interfaces for construction, composition, and tuning of probabilistic supervised learning strategies, together with orchestration features for validation and comparison of any such strategy - be it frequentist, Bayesian, or other.
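To make the idea of a strategy that predicts full distributions more concrete, the following is a minimal sketch using NumPy only. It does not use the skpro API. The class names `LinearGaussian` and `UnconditionalGaussian`, the method `predict_dist`, and the helper `run_demo` are hypothetical names introduced for illustration. The sketch also assumes that strategies are compared by out-of-sample log-loss (mean negative log-density), which is one common choice and not necessarily the only one the framework supports. Each strategy returns a Gaussian per test point, and a feature-dependent strategy is compared with an uninformed baseline on the same held-out data.

```python
import numpy as np


def gaussian_log_loss(y, mean, std):
    """Mean negative log-density of y under N(mean, std^2), per test point."""
    var = std ** 2
    return float(np.mean(0.5 * np.log(2 * np.pi * var) + (y - mean) ** 2 / (2 * var)))


class LinearGaussian:
    """Hypothetical strategy: least-squares mean, constant residual std."""

    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # add intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        self.std_ = float(np.std(y - A @ self.coef_))
        return self

    def predict_dist(self, X):
        # Returns the parameters (mean, std) of one Gaussian per row of X.
        A = np.column_stack([X, np.ones(len(X))])
        return A @ self.coef_, np.full(len(X), self.std_)


class UnconditionalGaussian:
    """Hypothetical baseline: ignores the features entirely."""

    def fit(self, X, y):
        self.mean_, self.std_ = float(np.mean(y)), float(np.std(y))
        return self

    def predict_dist(self, X):
        return np.full(len(X), self.mean_), np.full(len(X), self.std_)


def run_demo(seed=0):
    """Fit both strategies on synthetic data and return their test log-losses."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-3, 3, size=(400, 1))
    y = 2.0 * X[:, 0] + 1.0 + rng.normal(0.0, 0.5, size=400)
    X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]

    losses = {}
    for name, strategy in [("linear", LinearGaussian()),
                           ("baseline", UnconditionalGaussian())]:
        mean, std = strategy.fit(X_train, y_train).predict_dist(X_test)
        losses[name] = gaussian_log_loss(y_test, mean, std)
    return losses


if __name__ == "__main__":
    print(run_demo())
```

Because both strategies expose the same `fit`/`predict_dist` shape and are scored by the same loss on the same test set, the comparison does not depend on how each distribution was produced; a lower log-loss indicates a better probabilistic prediction on that data.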

