Let $Y_t$ denote the observation at time $t$ and $F_t$ denote the forecast of $Y_t$. Then define the forecast error $e_t=Y_t-F_t$.
Scale-Dependent Measures
- These are useful when comparing different methods applied to the same set of data, but should not be used when comparing across data sets that are on different scales.
Measure | Acronym | Definition | Feature |
---|---|---|---|
Mean Square Error | MSE | $mean(e_t^2)$ | |
Root Mean Square Error | RMSE | $\sqrt{MSE}$ | Often, the RMSE is preferred to the MSE as it is on the same scale as the data |
Mean Absolute Error | MAE | $mean(abs(e_t))$ | |
Median Absolute Error | MdAE | $median(abs(e_t))$ | |
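As a quick illustration, these four measures take only a few lines of NumPy (a minimal sketch; the function and argument names are illustrative, not from the paper):

```python
import numpy as np

def scale_dependent_measures(y, f):
    """MSE, RMSE, MAE, and MdAE for observations y and forecasts f."""
    e = np.asarray(y, dtype=float) - np.asarray(f, dtype=float)  # e_t = Y_t - F_t
    mse = np.mean(e ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),           # same scale as the data
        "MAE": np.mean(np.abs(e)),
        "MdAE": np.median(np.abs(e)),
    }
```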
Measures Based on Percentage Errors
- The percentage error is given by $p_t=100e_t/Y_t$. Percentage errors have the advantage of being scale-independent, and so are frequently used to compare forecast performance across different data sets.
- These measures have the disadvantage of being infinite or undefined if $Y_t=0$ for any $t$ in the period of interest, and having an extremely skewed distribution when any value of $Y_t$ is close to zero.
Measure | Acronym | Definition | Feature |
---|---|---|---|
Mean Absolute Percentage Error | MAPE | $mean(abs(p_t))$ | MAPE is often substantially larger than the MdAPE due to the skewed distribution when $Y_t$ is close to zero |
Median Absolute Percentage Error | MdAPE | $median(abs(p_t))$ | |
Root Mean Square Percentage Error | RMSPE | $\sqrt{mean(p_t^2)}$ | |
Root Median Square Percentage Error | RMdSPE | $\sqrt{median(p_t^2)}$ | |
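The percentage-error measures follow the same pattern (again a sketch with illustrative names; it will divide by zero if any $Y_t=0$, as noted above):

```python
import numpy as np

def percentage_error_measures(y, f):
    """MAPE, MdAPE, RMSPE, and RMdSPE; undefined if any Y_t is zero."""
    y = np.asarray(y, dtype=float)
    p = 100.0 * (y - np.asarray(f, dtype=float)) / y  # p_t = 100 * e_t / Y_t
    return {
        "MAPE": np.mean(np.abs(p)),
        "MdAPE": np.median(np.abs(p)),
        "RMSPE": np.sqrt(np.mean(p ** 2)),
        "RMdSPE": np.sqrt(np.median(p ** 2)),
    }
```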
- The MAPE and MdAPE also have the disadvantage that they put a heavier penalty on positive errors than on negative errors. This observation led to the use of the so-called symmetric measures.
- The problems arising from small values of $Y_t$ may be less severe for sMAPE and sMdAPE. However, if $Y_t$ is close to zero, $F_t$ is also likely to be close to zero, so the measure still involves division by a number close to zero, making the calculation unstable.
- Measures based on percentage errors are often highly skewed, and therefore transformations (such as logarithms) can make them more stable.
Measure | Acronym | Definition |
---|---|---|
Symmetric Mean Absolute Percentage Error | sMAPE | $mean(200*abs(Y_t-F_t)/(Y_t+F_t))$ |
Symmetric Median Absolute Percentage Error | sMdAPE | $median(200*abs(Y_t-F_t)/(Y_t+F_t))$ |
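A matching sketch for the symmetric variants (note the denominator $Y_t+F_t$ can still be close to zero):

```python
import numpy as np

def symmetric_measures(y, f):
    """sMAPE and sMdAPE; still unstable when Y_t + F_t is near zero."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    s = 200.0 * np.abs(y - f) / (y + f)
    return {"sMAPE": np.mean(s), "sMdAPE": np.median(s)}
```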
Measures Based on Relative Errors
- An alternative way of scaling is to divide each error by the error obtained using another standard method of forecasting.
- Let $r_t = e_t / e_t^*$ denote the relative error, where $e_t^*$ is the forecast error obtained from the benchmark method.
Measure | Acronym | Definition |
---|---|---|
Mean Relative Absolute Error | MRAE | $mean(abs(r_t))$ |
Median Relative Absolute Error | MdRAE | $median(abs(r_t))$ |
Geometric Mean Relative Absolute Error | GMRAE | $gmean(abs(r_t))$ |
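A sketch of the relative-error measures, assuming the benchmark errors $e_t^*$ are available and nonzero; the geometric mean is computed in log space for numerical stability:

```python
import numpy as np

def relative_error_measures(e, e_star):
    """MRAE, MdRAE, and GMRAE from errors e and benchmark errors e_star."""
    r = np.abs(np.asarray(e, dtype=float) / np.asarray(e_star, dtype=float))  # |r_t|
    return {
        "MRAE": np.mean(r),
        "MdRAE": np.median(r),
        "GMRAE": np.exp(np.mean(np.log(r))),  # geometric mean via log-space average
    }
```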
Relative Measures
- Rather than use relative errors, one can use relative measures.
- For example, let $MAE_b$ denote the MAE from the benchmark method.
- Then, a relative $MAE$ is given by $RelMAE = MAE/MAE_b$. Similar measures can be defined using RMSEs, MdAEs, MAPEs, etc.
- When $RelMAE < 1$, the proposed method is better than the benchmark method, and when $RelMAE > 1$, the proposed method is worse than the benchmark method.
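RelMAE is then a simple ratio of two MAEs (a minimal sketch):

```python
import numpy as np

def rel_mae(y, f_method, f_benchmark):
    """RelMAE = MAE / MAE_b; values below 1 favour the proposed method."""
    y = np.asarray(y, dtype=float)
    mae = np.mean(np.abs(y - np.asarray(f_method, dtype=float)))
    mae_b = np.mean(np.abs(y - np.asarray(f_benchmark, dtype=float)))
    return mae / mae_b
```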
Percent Better
- A related approach is to use the percentage of forecasts for which a given method is more accurate than the benchmark method. This is often known as Percent Better and can be expressed as $PB(MAE)=100 \times mean(I(MAE<MAE_b))$, where $I(\cdot)$ is the indicator function and the mean is taken across series.
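Assuming the per-series MAEs of both methods have already been computed, Percent Better reduces to the mean of an indicator:

```python
import numpy as np

def percent_better(mae_method, mae_benchmark):
    """PB(MAE): percentage of series where the method beats the benchmark."""
    mae_method = np.asarray(mae_method, dtype=float)
    mae_benchmark = np.asarray(mae_benchmark, dtype=float)
    return 100.0 * np.mean(mae_method < mae_benchmark)  # 100 * mean(I(MAE < MAE_b))
```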
Weighted Measures
It is reasonable to assume that not every prediction should be treated equally.
- For instance, we can assign weights in a way that the higher the weight, the higher importance we are placing on more recent data.
- The weighted Mean Absolute Error for a recommender system can be computed as follows (see the sketch after this list):
$$WMAE = \frac{1}{U}\sum_{i=1}^{U}\frac{1}{N_i}\sum_{j=1}^{N_i} w_{i,j}\left|r_{i,j}-p_{i,j}\right|$$
where
- $U$ represents the number of users;
- $N_i$, the number of items predicted for the $i^{th}$ user;
- $r_{i,j}$, the rating given by the $i^{th}$ user to item $I_j$;
- $p_{i,j}$, the rating predicted by the model; and
- $w_{i,j}$, the weight associated with this prediction.
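A sketch of one reading of this formula, taking ratings, predictions, and weights as per-user arrays (the exact normalization in Cleger-Tamayo et al. may differ):

```python
import numpy as np

def wmae(ratings, predictions, weights):
    """Weighted MAE averaged over U users; each argument is a list of per-user arrays."""
    per_user = [
        np.mean(np.asarray(w) * np.abs(np.asarray(r) - np.asarray(p)))  # (1/N_i) sum_j w_ij |r_ij - p_ij|
        for r, p, w in zip(ratings, predictions, weights)
    ]
    return float(np.mean(per_user))  # average over the U users
```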
Weighted Absolute Percentage Error (WAPE)
- WAPE (sometimes written WMAPE) weights each absolute error by the corresponding actual value rather than averaging the individual percentage errors:
$$WAPE = 100 \times \frac{sum(abs(Y_t-F_t))}{sum(Y_t)}$$
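In code, WAPE is essentially a one-liner:

```python
import numpy as np

def wape(y, f):
    """WAPE: total absolute error scaled by total actuals, in percent."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    return 100.0 * np.sum(np.abs(y - f)) / np.sum(y)
```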
Scaled Errors
- Hyndman and Koehler propose scaling the error by the in-sample MAE from the naive (random walk) forecast method. A scaled error is thus defined as
$$q_t = \frac{e_t}{\frac{1}{n-1}\sum_{i=2}^{n}\left|Y_i-Y_{i-1}\right|}$$
which is clearly independent of the scale of the data.
- A scaled error is less than one if it arises from a better forecast than the average one-step naive forecast computed in-sample. Conversely, it is greater than one if the forecast is worse than the average one-step naive forecast computed in-sample.
The Mean Absolute Scaled Error is simply
$$MASE=mean(|q_t|)$$
Related measures such as the Root Mean Squared Scaled Error (RMSSE) and the Median Absolute Scaled Error (MdASE) can be defined analogously.
- Of these measures, Hyndman and Koehler prefer MASE, as it is less sensitive to outliers and more easily interpreted than RMSSE, and less variable on small samples than MdASE.
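A sketch of MASE, assuming the scaling denominator is the in-sample naive MAE from a training series while the errors are evaluated on a test period:

```python
import numpy as np

def mase(y_train, y_test, f_test):
    """MASE: mean |q_t|, with errors scaled by the in-sample naive MAE."""
    y_train = np.asarray(y_train, dtype=float)
    naive_mae = np.mean(np.abs(np.diff(y_train)))  # (1/(n-1)) * sum |Y_i - Y_{i-1}|
    q = (np.asarray(y_test, dtype=float) - np.asarray(f_test, dtype=float)) / naive_mae
    return float(np.mean(np.abs(q)))
```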
Appendix
MAPE: Mean Absolute Percentage Error, where $A_t$ is the actual value and $F_t$ is the forecast value.
$$MAPE = \frac{100}{n}\sum_{t=1}^n \left | \frac{A_t-F_t}{A_t} \right |$$
RMSPE: Root Mean Square Percentage Error (the factor of 100 keeps it consistent with $p_t=100e_t/Y_t$ above).
$$RMSPE = 100\sqrt {\frac{1}{n}\sum_{t=1}^n \left(\frac{A_t-F_t}{A_t}\right)^2}$$
Reference
- Hyndman, R. J., & Koehler, A. B. (2006). Another look at measures of forecast accuracy. International journal of forecasting, 22(4), 679-688.
- Cleger-Tamayo, S., Fernández-Luna, J. M., & Huete, J. F. (2012, September). On the Use of Weighted Mean Absolute Error in Recommender Systems. In RUE@ RecSys (pp. 24-26).
- What's the gaps for the forecast error metrics: MAPE and WMAPE? (2017). Stackoverflow.com. Retrieved 3 April 2017, from http://stackoverflow.com/questions/12994929/whats-the-gaps-for-the-forecast-error-metrics-mape-and-wmape