Evaluating Predictive Uncertainty Challenge
This challenge is supported by the PASCAL Network of Excellence.
NEW! The Challenge was discussed at the PASCAL Challenges Workshop in Southampton on April 11, 2005. Some ideas from the Workshop:
Important dates
Regression winners
Classification winners
Other dataset winners
What is it about?
The goal of this challenge is to evaluate probabilistic methods for regression and classification problems. A number of regression and classification tasks are proposed. Training data (input-output pairs) are given, and the contestants are asked to predict the outputs associated with a set of validation and test inputs. These predictions are probabilistic and take the form of predictive distributions. The performance of the competing algorithms will be evaluated both with traditional losses that only take into account "point predictions" and with losses that evaluate the quality of the probabilistic predictions.
How do I participate?
Background
In many practical applications of Machine Learning there is a need for estimates of the accuracy of the predictions (or model uncertainty), which we will refer to as "predictive uncertainties". One example application where these are of crucial importance is active learning, where they are used to select the next training example that will bring the most information. Predictive uncertainties usually come in the form of error bars, confidence intervals, predictive distributions or posterior distributions. However, there is an apparent lack of consensus in the Machine Learning (or Statistical Learning) community on how to produce good estimates of predictive uncertainty, and on how to evaluate them in the first place. There are two ways of interpreting predictive uncertainties: as estimates of the error made when predicting at a given point, or as estimates of the overall generalization error (the average error on future unseen examples). At this point we do not intend to exclude either of these, but rather aim at investigating both.
From the theoretical point of view, there are essentially two abstract ways of modelling uncertainty in a Machine Learning problem. Put in extreme terms, one of them treats the training dataset as fixed and the function as random, while the other places the randomness in the training data and treats the "true" function as fixed. The first approach is commonly used by the Bayesian community, and the second by the Statistical Learning community. While predictive uncertainties arise naturally under the Bayesian paradigm, in Statistical Learning there is no consensus on how to define them. The Bayesian scheme starts from a prior, and is often attacked with questions like "what if the prior was incorrect?"
Indeed, priors that do not reflect one's actual beliefs are often used for reasons of convenience, such as analytic tractability or computational cost. Convenience priors might have an undesirable impact on the resulting predictive uncertainties. On the other hand, the Statistical Learning Theory community takes a completely different approach: it does not consider predictive distributions but instead focuses on tail bounds (confidence intervals). For example, assuming only that the data for a classification task are sampled at random and iid, bounds can be proved on the generalisation error (also called out-of-sample error or test error). This theory is often attacked because the bounds are too loose to be of much value.
We feel that there is a need to formulate ways of evaluating the quality of predictive uncertainties. In this challenge we propose measures of this quality.
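The contrast between losses on "point predictions" and losses on the predictive distribution can be illustrated with a minimal sketch. It assumes Gaussian predictive distributions for a regression task, and uses mean squared error and the average negative log predictive density as example losses. These loss choices, the function names and the toy numbers are assumptions made for illustration; they are not taken from the challenge description above.

```python
import math


def mean_squared_error(y_true, means):
    """Point-prediction loss: looks only at the predictive means."""
    return sum((y - m) ** 2 for y, m in zip(y_true, means)) / len(y_true)


def gaussian_nlpd(y_true, means, variances):
    """Average negative log density of the targets under Gaussian predictions.

    Each prediction is assumed to be N(mean, variance); lower is better.
    """
    total = 0.0
    for y, m, v in zip(y_true, means, variances):
        total += 0.5 * math.log(2.0 * math.pi * v) + (y - m) ** 2 / (2.0 * v)
    return total / len(y_true)


# Hypothetical toy data: three test targets and one set of predictive means.
y_true = [1.0, 2.0, 3.0]
means = [1.1, 1.9, 3.5]

# Two hypothetical models share the same means but report different variances.
moderate_variances = [0.1, 0.1, 0.1]
overconfident_variances = [0.001, 0.001, 0.001]

mse = mean_squared_error(y_true, means)
nlpd_moderate = gaussian_nlpd(y_true, means, moderate_variances)
nlpd_overconfident = gaussian_nlpd(y_true, means, overconfident_variances)

print("MSE (identical for both models):", mse)
print("NLPD with moderate variances:", nlpd_moderate)
print("NLPD with overconfident variances:", nlpd_overconfident)
```

Both hypothetical models receive the same mean squared error because their means coincide, whereas the density-based loss penalises the model whose error bars are far too narrow for the errors it actually makes.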