Understanding Model Calibration - A gentle & visual introduction to classic calibration and the expected calibration error + alternative definitions of calibration
Maja Pavlovic will present and discuss the works surrounding model calibration.
Abstract
To be considered reliable, a model must be calibrated so that its confidence in each decision closely reflects its true outcome. In this blogpost we’ll take a look at the most commonly used definition for calibration and then dive into the most popular evaluation measure for model calibration. We’ll then cover some of the drawbacks of this measure and how these surfaced the need for additional notions of calibration, which require their own new evaluation measures. This post is not intended to be an in-depth dissection of all works on calibration, nor does it focus on how to calibrate models. Instead, it is meant to provide a gentle introduction to the different notions and their evaluation measures as well as to re-highlight some issues with a measure that is still widely used to evaluate calibration.
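As a rough illustration of the kind of evaluation measure the abstract refers to, here is a minimal sketch of a binned expected calibration error in plain Python. The function name, the equal-width binning over [0, 1], and the default of 10 bins are assumptions made for illustration; they are not taken from the talk or the blog post.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin.

    confidences: predicted confidence of the chosen class, each in [0, 1]
    correct:     truthy where the prediction was right, falsy otherwise
    n_bins:      number of equal-width bins (an assumed, commonly used default)
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Half-open bins [k/n_bins, (k+1)/n_bins); a confidence of exactly
        # 1.0 is placed in the last bin. Other edge conventions exist.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, bool(hit)))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Four predictions made at 90% confidence, three of them correct:
# the single occupied bin has accuracy 0.75 and mean confidence 0.9.
example = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

In this toy case the gap between confidence and accuracy is about 0.15, so the model would be slightly overconfident. The result depends on the number of bins and the binning scheme, which is one of the sensitivities of this kind of measure.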
Updates
Since this talk was presented, the work has been turned into a blog post (Pavlovic, 2025), which won 🏆Best Blog Post🏆 at ICLR 2025.
References