The forecasts X01 to X10 are generated in such a way that their discrimination ability decreases steadily from X01 to X10. In addition, X01 and X06 are "calibrated", X02 and X07 are "underconfident", X03 and X08 are "overconfident", X04 and X09 exhibit "negative bias", and X05 and X10 exhibit "positive bias".
Format
A data frame with 1,000 rows and 11 columns, generated as described in 'Details':
- y
observations
- X01
forecasts, full information, calibrated: \(a = 1\), \(b = 1\)
- X02
forecasts, less information than X01, underconfident: \(a = 1/4\), \(b = 1/4\)
- X03
forecasts, less information than X02, overconfident: \(a = 4\), \(b = 4\)
- X04
forecasts, less information than X03, negative bias: \(a = 5/3\), \(b = 3/5\)
- X05
forecasts, less information than X04, positive bias: \(a = 3/5\), \(b = 5/3\)
- X06
forecasts, less information than X05, calibrated: \(a = 1\), \(b = 1\)
- X07
forecasts, less information than X06, underconfident: \(a = 1/4\), \(b = 1/4\)
- X08
forecasts, less information than X07, overconfident: \(a = 4\), \(b = 4\)
- X09
forecasts, less information than X08, negative bias: \(a = 5/3\), \(b = 3/5\)
- X10
forecasts, least information, positive bias: \(a = 2/3\), \(b = 3/2\)
Details
The observations are generated from a Bernoulli distribution, where the success probability is determined by ten sources of information. That is, the probability is given by $$p = \Phi\left(\sum_{i = 1}^{10} Z_i\right),$$ where \(Z_i\), \(i = 1, \ldots, 10\), are independent standard Gaussian random variables, and \(\Phi\) denotes the cumulative distribution function of the standard Gaussian distribution.
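The observation step described above can be sketched in a few lines of base R. This is a minimal illustration, not the code used to build the data set: the seed and the object names `n`, `Z`, `p` and `y` are assumptions chosen for the example, so the simulated values will not reproduce the shipped data.

```r
# Minimal sketch of the observation-generating process (illustrative only).
set.seed(42)                      # arbitrary seed, an assumption of this example
n <- 1000                         # number of rows, as stated in 'Format'

# Ten independent standard Gaussian information sources per observation
Z <- matrix(rnorm(n * 10), nrow = n, ncol = 10)

# Success probability: standard Gaussian CDF of the sum of all ten sources
p <- pnorm(rowSums(Z))

# Bernoulli observations with success probability p
y <- rbinom(n, size = 1, prob = p)
```

Because the sum of ten standard Gaussian variables has variance 10, most values of `p` lie close to 0 or 1, which is why a forecast with full information discriminates so well.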
The corresponding forecasts are named in decreasing order of access to these
latent Gaussian variables (that is, information content). In a first step,
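calibrated forecasts are generated by
\(p[j] = \Phi(\frac{1}{\sqrt{j}}\sum_{i = j}^{10} Z_i)\),
where the factor \(1/\sqrt{j}\) accounts for the \(j - 1\) latent variables
that forecast \(j\) does not observe, so that \(p[j]\) is the conditional
event probability given \(Z_j, \ldots, Z_{10}\).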
Subsequently, these probabilities are perturbed to introduce miscalibration
using the cumulative distribution function \(F\) of the beta distribution, yielding
the final forecasts
$$X[j] = F(p[j]; a, b),$$
where \(a\) and \(b\) are the positive shape parameters (see pbeta()).
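The two-step construction of the forecasts can be sketched as follows. This is a hedged illustration under stated assumptions rather than the package's own code: the helper names `calibrated()` and `perturb()` are hypothetical, the seed is arbitrary, and the scaling \(1/\sqrt{j}\) is used because the \(j - 1\) unobserved variables add variance \(j - 1\) to the unit variance of the probit link, which is what makes the first-step forecast a conditional event probability.

```r
# Minimal sketch of the forecast construction (illustrative only).
set.seed(1)                       # arbitrary seed, an assumption of this example
n <- 1000
Z <- matrix(rnorm(n * 10), nrow = n, ncol = 10)

# Hypothetical helper: calibrated forecast with access to Z_j, ..., Z_10.
# Dividing by sqrt(j) integrates out the j - 1 unobserved sources.
calibrated <- function(Z, j) {
  pnorm(rowSums(Z[, j:10, drop = FALSE]) / sqrt(j))
}

# Hypothetical helper: perturb probabilities with the beta CDF F(p; a, b).
perturb <- function(p, a, b) {
  pbeta(p, shape1 = a, shape2 = b)
}

p1 <- calibrated(Z, 1)            # full information
p3 <- calibrated(Z, 3)            # access to Z_3, ..., Z_10 only

x1 <- perturb(p1, a = 1, b = 1)   # a = b = 1 is the identity: stays calibrated
x3 <- perturb(p3, a = 4, b = 4)   # a = b = 4 pushes forecasts away from 0.5
```

With \(a = b = 1\) the beta distribution is uniform, so the forecast is left unchanged. With \(a = b = 4\) the beta CDF is steep around 0.5, so forecasts move toward 0 and 1 (overconfidence), whereas \(a = b = 1/4\) has the opposite effect.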