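Forecast Error Measures for Continuous Variables

This article summarizes and discusses forecast error measures for continuous, univariate data.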

Let $Y_t$ denote the observation at time $t$ and $F_t$ denote the forecast of $Y_t$. Then define the forecast error $e_t=Y_t-F_t$.

Scale-Dependent Measures

  • These are useful when comparing different methods applied to the same set of data, but should not be used to compare across data sets that have different scales.

| Measure | Acronym | Definition | Notes |
|---|---|---|---|
| Mean Square Error | MSE | $mean(e_t^2)$ | |
| Root Mean Square Error | RMSE | $\sqrt{MSE}$ | Often preferred to the MSE, as it is on the same scale as the data |
| Mean Absolute Error | MAE | $mean(abs(e_t))$ | |
| Median Absolute Error | MdAE | $median(abs(e_t))$ | |
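
As a minimal sketch of the scale-dependent measures, the following NumPy snippet computes all four from the same error vector $e_t = Y_t - F_t$. The function name and the sample data are illustrative assumptions, and the MdAE is taken over absolute errors.

```python
import numpy as np

def scale_dependent_measures(y, f):
    """Return MSE, RMSE, MAE and MdAE for observations y and forecasts f."""
    # Forecast error e_t = Y_t - F_t
    e = np.asarray(y, dtype=float) - np.asarray(f, dtype=float)
    mse = np.mean(e ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),          # same scale as the data
        "MAE": np.mean(np.abs(e)),
        "MdAE": np.median(np.abs(e)),  # median of absolute errors
    }

# Hypothetical observations and forecasts
y = [10.0, 12.0, 14.0, 16.0]
f = [11.0, 12.0, 12.0, 19.0]
result = scale_dependent_measures(y, f)
print(result)
```

All four values are in the units of the data (or squared units for MSE), which is why they are only comparable within a single data set.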

Measures Based on Percentage Errors

  • The percentage error is given by $p_t=100e_t/Y_t$. Percentage errors have the advantage of being scale-independent, and so are frequently used to compare forecast performance across different data sets.
  • These measures have the disadvantage of being infinite or undefined if $Y_t=0$ for any $t$ in the period of interest, and having an extremely skewed distribution when any value of $Y_t$ is close to zero.

| Measure | Acronym | Definition | Notes |
|---|---|---|---|
| Mean Absolute Percentage Error | MAPE | $mean(abs(p_t))$ | Often substantially larger than the MdAPE due to the skewed distribution when $Y_t$ is close to zero |
| Median Absolute Percentage Error | MdAPE | $median(abs(p_t))$ | |
| Root Mean Square Percentage Error | RMSPE | $\sqrt{mean(p_t^2)}$ | |
| Root Median Square Percentage Error | RMdSPE | $\sqrt{median(p_t^2)}$ | |

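
The percentage-error measures can be sketched in the same way, starting from $p_t = 100e_t/Y_t$. This is an illustrative example rather than a reference implementation: the function name and data are assumptions, and raising an error when any $Y_t$ is zero is one possible way to handle the undefined case.

```python
import numpy as np

def percentage_error_measures(y, f):
    """Return MAPE, MdAPE, RMSPE and RMdSPE for observations y and forecasts f."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    if np.any(y == 0):
        # p_t is infinite or undefined when Y_t = 0
        raise ValueError("percentage errors are undefined when any Y_t is zero")
    p = 100.0 * (y - f) / y  # percentage error p_t
    return {
        "MAPE": np.mean(np.abs(p)),
        "MdAPE": np.median(np.abs(p)),
        "RMSPE": np.sqrt(np.mean(p ** 2)),
        "RMdSPE": np.sqrt(np.median(p ** 2)),
    }

# Hypothetical observations and forecasts
y = [100.0, 200.0, 50.0, 400.0]
f = [110.0, 180.0, 55.0, 400.0]
pct = percentage_error_measures(y, f)
print(pct)
```
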
  • The MAPE and MdAPE also have the disadvantage that they put a heavier penalty on positive errors than on negative errors. This observation led to the use of the so-called symmetric measures.
  • The problems arising from small values of $Y_t$ may be less severe for sMAPE and sMdAPE. However, even there, if $Y_t$ is close to zero then $F_t$ is also likely to be close to zero, so the measure still involves division by a number close to zero.
  • Measures based on percentage errors are often highly skewed, and therefore transformations (such as logarithms) can make them more stable.

| Measure | Acronym | Definition |
|---|---|---|
| Symmetric Mean Absolute Percentage Error | sMAPE | $mean(200*abs(Y_t-F_t)/(Y_t+F_t))$ |
| Symmetric Median Absolute Percentage Error | sMdAPE | $median(200*abs(Y_t-F_t)/(Y_t+F_t))$ |
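
A short sketch of the symmetric measures, following the definitions $200 \cdot abs(Y_t-F_t)/(Y_t+F_t)$. The function name, the data, and the choice to raise an error when $Y_t + F_t = 0$ are assumptions made for illustration.

```python
import numpy as np

def symmetric_measures(y, f):
    """Return sMAPE and sMdAPE for observations y and forecasts f."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    denom = y + f
    if np.any(denom == 0):
        # The ratio is undefined when Y_t + F_t = 0
        raise ValueError("symmetric percentage errors are undefined when Y_t + F_t = 0")
    s = 200.0 * np.abs(y - f) / denom
    return {"sMAPE": np.mean(s), "sMdAPE": np.median(s)}

# Hypothetical observations and forecasts
y = [100.0, 200.0, 50.0]
f = [110.0, 180.0, 50.0]
sym = symmetric_measures(y, f)
print(sym)
```

Note that the denominator shrinks when both $Y_t$ and $F_t$ are near zero, so these measures can still be unstable for such data.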

Measures Based on Relative Errors

  • An alternative way of scaling is to divide each error by the error obtained using another standard method of forecasting.
  • Let $r_t = e_t / e_t^*$ denote the relative error, where $e_t^*$ is the forecast error obtained from the benchmark method.

| Measure | Acronym | Definition |
|---|---|---|
| Mean Relative Absolute Error | MRAE | $mean(abs(r_t))$ |
| Median Relative Absolute Error | MdRAE | $median(abs(r_t))$ |
| Geometric Mean Relative Absolute Error | GMRAE | $gmean(abs(r_t))$ |
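
The relative-error measures can be sketched as follows. The function name and data are illustrative assumptions. The geometric mean is computed through logarithms, which assumes every relative error is nonzero, and the code rejects benchmark errors of exactly zero because $r_t$ is undefined there.

```python
import numpy as np

def relative_error_measures(y, f, f_bench):
    """Return MRAE, MdRAE and GMRAE of forecasts f against benchmark forecasts f_bench."""
    y = np.asarray(y, dtype=float)
    e = y - np.asarray(f, dtype=float)             # errors of the proposed method
    e_star = y - np.asarray(f_bench, dtype=float)  # errors of the benchmark method
    if np.any(e_star == 0):
        raise ValueError("relative errors are undefined when a benchmark error is zero")
    r = np.abs(e / e_star)
    return {
        "MRAE": np.mean(r),
        "MdRAE": np.median(r),
        # Geometric mean via logs; assumes all |r_t| > 0
        "GMRAE": np.exp(np.mean(np.log(r))),
    }

# Hypothetical observations, forecasts and benchmark forecasts
y = [10.0, 20.0, 30.0, 40.0]
f = [11.0, 19.0, 32.0, 38.0]
f_bench = [12.0, 16.0, 31.0, 44.0]
rel = relative_error_measures(y, f, f_bench)
print(rel)
```
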

Relative Measures

  • Rather than use relative errors, one can use relative measures.
  • For example, let $MAE_b$ denote the MAE from the benchmark method.
  • Then, a relative $MAE$ is given by $RelMAE = MAE/MAE_b$. Similar measures can be defined using RMSEs, MdAEs, MAPEs, etc.
  • When $RelMAE < 1$, the proposed method is better than the benchmark method, and when $RelMAE > 1$, the proposed method is worse than the benchmark method.
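
A minimal sketch of the relative MAE, assuming both methods are evaluated on the same observations. The function name and data are hypothetical.

```python
import numpy as np

def rel_mae(y, f, f_bench):
    """Return RelMAE = MAE / MAE_b for forecasts f against benchmark forecasts f_bench."""
    y = np.asarray(y, dtype=float)
    mae = np.mean(np.abs(y - np.asarray(f, dtype=float)))
    mae_b = np.mean(np.abs(y - np.asarray(f_bench, dtype=float)))
    return mae / mae_b

# Hypothetical observations, forecasts and benchmark forecasts
y = [10.0, 20.0, 30.0, 40.0]
f = [11.0, 19.0, 32.0, 38.0]
f_bench = [12.0, 16.0, 31.0, 44.0]
ratio = rel_mae(y, f, f_bench)
print(ratio)  # below 1 means the proposed method beats the benchmark
```
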

Percent Better

  • A related approach is to use the percentage of forecasts for which a given method is more accurate than the benchmark method. This is often known as Percent Better and can be expressed as $PB(MAE)=100 \cdot mean(I(MAE<MAE_b))$, where $I(\cdot)$ is the indicator function.
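
Percent Better can be sketched as below, assuming one MAE per series for each method so that the mean of the indicator runs across series. The function name and the per-series MAE values are hypothetical.

```python
import numpy as np

def percent_better(mae_method, mae_bench):
    """Return the percentage of series on which the method's MAE beats the benchmark's."""
    mae_method = np.asarray(mae_method, dtype=float)
    mae_bench = np.asarray(mae_bench, dtype=float)
    # The boolean array plays the role of the indicator I(MAE < MAE_b)
    return 100.0 * np.mean(mae_method < mae_bench)

# Hypothetical per-series MAE values for four series
mae_method = [1.2, 0.8, 2.0, 1.5]
mae_bench = [1.5, 0.9, 1.8, 1.5]
pb = percent_better(mae_method, mae_bench)
print(pb)
```

Ties count as "not better" here because the inequality is strict. The measure ignores how large the improvement is on each series.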

Weighted Measures

It is reasonable to assume that not every prediction should be treated equally.

  • For instance, we can assign higher weights to more recent data in order to give it greater importance.
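  • The weighted Mean Absolute Error for a recommender system can be computed as follows:
    $$wMAE = \frac{\sum_{i=1}^{U}\sum_{j=1}^{N_i} w_{i,j}\,abs(p_{i,j}-r_{i,j})}{\sum_{i=1}^{U}\sum_{j=1}^{N_i} w_{i,j}}$$
    where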
    • $U$ represents the number of users;
    • $N_i$, the number of items predicted for the $i^{th}$ user;
    • $r_{i,j}$, the rating given by the $i^{th}$ user to the item $I_j$;
    • $p_{i,j}$, the rating predicted by the model;
    • $w_{i,j}$ represents the weight associated to this prediction.
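
A sketch of a weighted MAE using the symbols above. It assumes the usual normalisation by the sum of the weights, and it flattens the user and item indices into one row per (user, item) pair. The function name and data are hypothetical.

```python
import numpy as np

def weighted_mae(r, p, w):
    """Weighted MAE over flattened (user, item) pairs.

    r: actual ratings r_{i,j}; p: predicted ratings p_{i,j}; w: weights w_{i,j}.
    Assumes normalisation by the sum of the weights.
    """
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    return np.sum(w * np.abs(p - r)) / np.sum(w)

# Hypothetical ratings, predictions and weights for three (user, item) pairs
r = [4.0, 3.0, 5.0]
p = [3.5, 3.0, 4.0]
w = [1.0, 2.0, 1.0]
wmae = weighted_mae(r, p, w)
print(wmae)
```

With all weights equal, this reduces to the ordinary MAE.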

Another Error Metric: WAPE

$$WAPE = 100 \times \frac{sum(abs(Y_t-F_t))}{sum(Y_t)}$$
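
The WAPE formula translates directly into NumPy. This is a minimal sketch; the function name and data are illustrative, and the check for a zero denominator is an added assumption.

```python
import numpy as np

def wape(y, f):
    """Return WAPE = 100 * sum(|Y_t - F_t|) / sum(Y_t)."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    total = np.sum(y)
    if total == 0:
        raise ValueError("WAPE is undefined when the observations sum to zero")
    return 100.0 * np.sum(np.abs(y - f)) / total

# Hypothetical observations and forecasts
y = [100.0, 200.0, 50.0]
f = [110.0, 180.0, 50.0]
value = wape(y, f)
print(value)
```

Because the division happens once on the totals, individual zeros in $Y_t$ do not make the measure undefined, unlike MAPE.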


Scaled Errors

  • The error can be scaled by the in-sample MAE from the naive (random walk) forecast method. A scaled error is thus defined as
    $$q_t = \frac{e_t}{\frac{1}{n-1}\sum_{i=2}^{n} |Y_i - Y_{i-1}|}$$
    which is clearly independent of the scale of the data.
  • A scaled error is less than one if it arises from a better forecast than the average one-step naive forecast computed in-sample. Conversely, it is greater than one if the forecast is worse than the average one-step naive forecast computed in-sample.
  • The Mean Absolute Scaled Error is simply
    $$MASE=mean(|q_t|)$$

  • Related measures such as Root Mean Squared Scaled Error (RMSSE) and Median Absolute Scaled Error (MdASE) can be defined analogously.

  • Of these measures, we prefer MASE as it is less sensitive to outliers and more easily interpreted than RMSSE, and less variable on small samples than MdASE.
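
A sketch of MASE under the assumption that each scaled error $q_t$ is the out-of-sample error divided by the in-sample MAE of the one-step naive forecast, as described above. The function name, argument names, and data are hypothetical.

```python
import numpy as np

def mase(y_train, y_test, f_test):
    """Mean Absolute Scaled Error, scaled by the in-sample naive-forecast MAE."""
    y_train = np.asarray(y_train, dtype=float)
    y_test = np.asarray(y_test, dtype=float)
    f_test = np.asarray(f_test, dtype=float)
    # In-sample MAE of the naive forecast: mean of |Y_i - Y_{i-1}| for i = 2..n
    scale = np.mean(np.abs(np.diff(y_train)))
    if scale == 0:
        raise ValueError("scaled errors are undefined for a constant in-sample series")
    q = (y_test - f_test) / scale  # scaled errors q_t
    return np.mean(np.abs(q))

# Hypothetical in-sample series, out-of-sample observations and forecasts
y_train = [10.0, 12.0, 11.0, 13.0, 15.0]
y_test = [16.0, 14.0]
f_test = [15.0, 15.0]
score = mase(y_train, y_test, f_test)
print(score)  # below 1: better than the average in-sample naive forecast
```
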

Appendix

  • MAPE: Mean Absolute Percentage Error, where $A$ is actual value and $F$ is forecast value.
    $$MAPE = \frac{100}{n}\sum_{t=1}^n \left | \frac{A_t-F_t}{A_t} \right |$$

  • RMSPE: Root Mean Square Percentage Error
    $$RMSPE = 100\sqrt {\frac{1}{n}\sum_{t=1}^n \left ( \frac{A_t-F_t}{A_t} \right )^2}$$

