summary.reliabilitydiag
Source: R/summary.R
An object of class reliabilitydiag contains the observations, the original forecasts, and the recalibrated forecasts obtained by isotonic regression. The function summary.reliabilitydiag calculates quantitative measures of predictive performance, miscalibration, discrimination, and uncertainty for each of the prediction methods, in relation to its recalibrated version.
# S3 method for reliabilitydiag
summary(object, ..., score = "brier")
object: an object inheriting from the class 'reliabilitydiag'.

...: further arguments to be passed to or from methods.

score: currently only "brier", or a vectorized scoring function, that is, function(observation, prediction).
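A vectorized scoring function takes equal-length vectors of observations and predictions and returns a vector of pointwise scores. A minimal sketch (the squared error underlying the Brier score; the function name is illustrative only):

sq_err <- function(y, x) (x - y)^2
sq_err(c(0, 1, 1), c(0.2, 0.9, 0.4))
#> [1] 0.04 0.01 0.36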
A 'summary.reliabilitydiag' object, which is also a tibble (see tibble::tibble()) with columns:
forecast: the name of the prediction method.
mean_score: the mean score of the original forecast values.
miscalibration: a measure of miscalibration (how reliable is the prediction method?); smaller is better.
discrimination: a measure of discrimination (how variable are the recalibrated predictions?); larger is better.
uncertainty: the mean score of a constant prediction at the value of the average observation.
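Since the result is an ordinary tibble, its columns can be used directly, for instance to rank the prediction methods by miscalibration (a sketch, using the object r created in the examples below):

s <- summary(r)
s[order(s$miscalibration), ]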
Predictive performance is measured by the mean score of the original forecast values, denoted by \(S\).
Uncertainty, denoted by \(UNC\), is the mean score of a constant prediction at the value of the average observation. It is the highest possible mean score of a calibrated prediction method.
Discrimination, denoted by \(DSC\), is \(UNC\) minus the mean score of the PAV-recalibrated forecast values. A small value indicates a low information content (low signal) in the original forecast values.
Miscalibration, denoted by \(MCB\), is \(S\) minus the mean score of the PAV-recalibrated forecast values. A high value indicates that predictive performance of the prediction method can be improved by recalibration.
These measures are related by the exact equation $$S = MCB - DSC + UNC.$$ Score decompositions of this type have been studied extensively; the optimality of the PAV solution ensures that \(MCB\) is nonnegative regardless of the chosen (admissible) scoring function, a property unique to PAV-recalibration.
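The decomposition is exact and can be verified numerically on any summary tibble (a sketch, again using the object r from the examples below):

s <- summary(r)
# mean_score should equal miscalibration - discrimination + uncertainty
# up to numerical tolerance
all.equal(s$mean_score, s$miscalibration - s$discrimination + s$uncertainty)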
When using a performance metric other than the Brier score, make sure to choose a proper scoring rule for binary events or, equivalently, a scoring function with outcome space {0, 1} that is consistent for the expectation functional.
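For instance, the logarithmic score is a proper scoring rule for binary events and can be supplied in the same way (a sketch, using the object r from the examples below; note that it is infinite for predictions that are exactly 0 or 1):

log_score <- function(y, x) -y * log(x) - (1 - y) * log(1 - x)
summary(r, score = log_score)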
data("precip_Niamey_2016", package = "reliabilitydiag")
r <- reliabilitydiag(
precip_Niamey_2016[c("Logistic", "EMOS", "ENS", "EPC")],
y = precip_Niamey_2016$obs,
region.level = NA
)
summary(r)
#> 'brier' score decomposition (see also ?summary.reliabilitydiag)
#> # A tibble: 4 × 5
#> forecast mean_score miscalibration discrimination uncertainty
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Logistic 0.206 0.0171 0.0555 0.244
#> 2 EMOS 0.232 0.0183 0.0305 0.244
#> 3 ENS 0.266 0.0661 0.0441 0.244
#> 4 EPC 0.234 0.0223 0.0323 0.244
summary(r, score = function(y, x) (x - y)^2)
#> 'function(y, x) (x - y)^2' score decomposition (see also ?summary.reliabilitydiag)
#> # A tibble: 4 × 5
#> forecast mean_score miscalibration discrimination uncertainty
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Logistic 0.206 0.0171 0.0555 0.244
#> 2 EMOS 0.232 0.0183 0.0305 0.244
#> 3 ENS 0.266 0.0661 0.0441 0.244
#> 4 EPC 0.234 0.0223 0.0323 0.244