Methods to Measure Forecast Error


The first rule of forecasting is that all forecasts are either wrong or lucky. Failing to learn from the occasions when your forecast is wrong or lucky, however, makes it much less likely that forecasting accuracy will improve over time. The first and most beneficial purpose of accuracy analysis is to learn from your mistakes; after all, you can't manage what you don't measure.
As with so many areas of forecasting, there is no single 'best' measure of forecasting accuracy. For typical operational forecasting, a combination of MAPE, FVA and Exception Analysis is the dream ticket, and I will come on to why a little later in this article.

  • Mean Percentage Error (MPE)
  • Mean Absolute Deviation (MAD)
  • Mean Squared Error (MSE)
  • Mean Absolute Percentage Error (MAPE)
  • Weighted Mean Absolute Percentage Error (WMAPE)
  • Error Total (ET)
  • Forecast Value Added (FVA)
  • Exceptions Analysis

Each of these measures has its pros and cons, but context is everything, especially when deciding which measure to use in which situation; the right choice depends heavily on what you are forecasting. FVA and Exceptions Analysis are slightly different in nature from the first six above, and I will go into a little more depth later in this article on when and how to use them.
The overall accuracy of any forecasting method, no matter which one you use, is determined by comparing the forecasted values with actual values. However, to help determine which of the first six methods fits your situation best, it is important to first understand the two main types of error measurement:
Error Sign – in simple terms, this determines whether you want to treat positive and negative forecast error the same or differently, i.e. does it make any difference to you whether the forecast is more than actual or less than actual? In operational planning settings both types of error are usually equally harmful; however, if you were forecasting for, say, perishable products, you would always prefer the forecast to be less than actual, as surplus production is as good as a loss.

Error Spread – this determines whether it matters that the forecast error is concentrated in a few points or spread evenly across many, e.g. do you mind if the forecast goes horribly wrong on a few points as long as it is accurate across the entire horizon? Again, the shelf life of the item plays a big part here. For example, a live-channel customer contact (on, say, phone or chat) has a very short shelf life (you have until the customer abandons their attempt), whilst a customer email has a longer shelf life: an email unanswered in one period can effectively be dealt with in subsequent periods, so it matters less if the forecast goes wrong in one period as long as we make up for it across a broader horizon.
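To make the distinction concrete, here is a minimal Python sketch (the volumes and forecasts are made-up, purely illustrative numbers) showing how a signed measure, an absolute measure and a squared measure react differently to error sign and error spread:

```python
# Illustrative only: two forecasts with the same total absolute error,
# but very different error sign and error spread behaviour.
actuals    = [100, 100, 100, 100]
forecast_a = [110, 110,  90,  90]   # over- and under-forecasts cancel, error evenly spread
forecast_b = [100, 100, 100, 140]   # all of the error concentrated in one period

def mpe(actual, forecast):
    """Mean Percentage Error: keeps the sign, so opposite errors cancel out."""
    return sum((f - a) / a for a, f in zip(actual, forecast)) / len(actual) * 100

def mad(actual, forecast):
    """Mean Absolute Deviation: ignores the sign, treats every error equally."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean Squared Error: punishes a few large misses far more than many small ones."""
    return sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual)

for name, fc in [("A (even spread)", forecast_a), ("B (one big miss)", forecast_b)]:
    print(name, round(mpe(actuals, fc), 1), mad(actuals, fc), mse(actuals, fc))
# A: MPE 0.0 (signs cancel), MAD 10.0, MSE 100.0
# B: MPE 10.0,               MAD 10.0, MSE 400.0  -> MSE flags the concentrated error
```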

Here is a quick reference table showing how each forecast accuracy measure weights Error Sign and Error Spread:
As mentioned already, for operational workload forecasting a positive or negative "Error Sign" is usually equally harmful, with either resulting in under- or over-staffing. As a result, the three most popular accuracy methods tend to be Mean Absolute Deviation (MAD), Mean Squared Error (MSE) and/or Mean Absolute Percent Error (MAPE).
However, a common problem for both MAD and MSE is that their values depend on the magnitude of the item being forecast. For example, if the forecast item is measured in thousands, the MAD and MSE results can be very large – not so good for typical operational planning workload forecasting unless your organisation is huge. So this leaves us with MAPE. 
MAPE or WMAPE
Given the limitations of MAD and MSE, this logically takes us to MAPE. MAPE in its traditional form is computed as the average of the absolute differences between the forecasted and actual values, expressed as a percentage of the actual values. MAPE is perhaps also the easiest measure to interpret and, unlike MAD and MSE, its value does not depend on the scale of the item being forecast.
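As a minimal sketch (the daily volumes below are purely illustrative), traditional MAPE can be computed like this:

```python
def mape(actual, forecast):
    """Traditional MAPE: the average of |forecast - actual| / actual, as a percentage.
    Note: this raises ZeroDivisionError if any actual is zero (the divide-by-zero
    problem discussed below)."""
    errors = [abs(f - a) / a for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors) * 100

actual   = [1200, 950, 1100, 1050]   # illustrative daily contact volumes
forecast = [1150, 1000, 1180, 1000]
print(round(mape(actual, forecast), 1))   # 5.4 (%)
```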

However, a major difficulty with MAPE is that if the base in any individual percentage error calculation is zero, the result cannot be calculated – the so-called divide-by-zero problem. Various workarounds have been used to deal with this issue, but none of them are mathematically correct. Perhaps the biggest problem arises when MAPE is used to assess the historical errors of different forecasting models with a view to selecting the best model: MAPE is wholly unsuitable for assessing any item with an intermittent demand pattern in this way.

Also, when calculating the average MAPE across a number of time series, you may encounter a problem: a few series with a very high MAPE can distort a comparison between the average MAPE of series fitted with one method and the average MAPE of series fitted with another.
This disadvantage of MAPE follows directly from how it is calculated: a large percentage error on a small actual value can dominate the result, so a single period with a very low actual can easily account for more than half of the overall MAPE.

In order to avoid this problem, other measures have been defined, for example the SMAPE (symmetrical MAPE), weighted absolute percentage error (WAPE), real aggregated percentage error, and relative measure of accuracy (ROMA).
My personal favourite is WMAPE, mainly because of its simplicity and ease of calculation. There is a very simple way to calculate WMAPE: add together the absolute errors at the detailed level, then express the total error as a percentage of total volume. This method of calculation has the additional benefit of being robust to individual instances where the base is zero, thus overcoming the divide-by-zero problem that often occurs with MAPE.
WMAPE is a highly useful measure and is becoming increasingly popular, both in corporate KPIs and for operational use. It is easily calculated and gives a concise forecast accuracy measurement that can be used to summarise performance at any level of detail across any grouping of products and/or time periods. If a measure of accuracy is required, it is calculated as 100% – WMAPE.
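As a minimal sketch of that calculation (the volumes are illustrative and deliberately include one very small actual), this also shows why WMAPE is far less distortion-prone than MAPE:

```python
def wmape(actual, forecast):
    """WMAPE: total absolute error as a percentage of total actual volume.
    Robust to individual zero actuals, as long as total volume is non-zero."""
    total_abs_error = sum(abs(f - a) for a, f in zip(actual, forecast))
    return total_abs_error / sum(actual) * 100

actual   = [1000, 1200, 900, 10]    # one very small actual in the last period
forecast = [1050, 1150, 950, 40]

# MAPE:  (50/1000 + 50/1200 + 50/900 + 30/10) / 4 * 100        ≈ 78.7%  (dominated by the last point)
# WMAPE: (50 + 50 + 50 + 30) / (1000 + 1200 + 900 + 10) * 100  ≈ 5.8%
print(round(wmape(actual, forecast), 1))        # 5.8
print(round(100 - wmape(actual, forecast), 1))  # accuracy measure: 94.2
```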
Forecast Value Added (FVA)
WMAPE will tell you the size of your forecast error, which is of course very important. However, it won't tell you how efficiently you are forecasting, help you understand the drivers or the underlying true variability, indicate what the minimum achievable error rate should be, or even tell you whether the different methods and models you are using are making the forecast better or worse.

To determine this, I advise you to use a very simple process called Forecast Value Added (FVA). It requires a little extra effort up front, but in the long run it can really add accuracy value and reduce forecasting man-hour costs by helping you avoid pointless forecasting process steps. It uses the simplest, least labour-intensive method of forecasting (namely a "naïve" forecast) as a benchmark for the forecasting accuracy of each stage in your current process. For example, how much accuracy is being added by causal factors, and is the leadership review adding value or just biased views?
The diagram above shows a typical forecasting process; by running FVA you are able to answer the following questions:
  • Are all the stages in your forecasting process actually adding accuracy?
  • Is the effort expended at each stage actually worth it?
  • Where in the process are your weak points?
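Before moving on, here is a minimal sketch of the FVA idea. The stage names and all of the numbers are hypothetical, and WMAPE is used here as the error measure, but the principle is simply to benchmark each stage of the process against the naïve forecast:

```python
# Minimal FVA sketch: benchmark each stage of the process against a naive forecast.
# Stage names and all numbers are hypothetical; WMAPE is used as the error measure.

def wmape(actual, forecast):
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / sum(actual) * 100

actual         = [520, 480, 610, 550, 590]
naive          = [500, 520, 480, 610, 550]   # naive forecast = previous period's actual
statistical    = [510, 490, 590, 560, 580]   # output of the statistical model
final_forecast = [530, 470, 640, 540, 600]   # after causal factors / leadership review

baseline = wmape(actual, naive)
print(f"naive benchmark: WMAPE {baseline:.1f}%")
for stage, fc in [("statistical model", statistical), ("final forecast", final_forecast)]:
    fva = baseline - wmape(actual, fc)       # positive FVA = this stage beats the naive benchmark
    print(f"{stage}: WMAPE {wmape(actual, fc):.1f}%, FVA vs naive {fva:+.1f} pts")
```

In this made-up example the statistical model adds value over the naïve benchmark, while the subsequent review step gives back a little of that gain – exactly the kind of finding FVA is designed to surface.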
Exception Analysis
Summary measures such as WMAPE are useful for tracking accuracy over time. However, exceptions analysis aims to identify and explain the reasons for the biggest / most expensive forecast errors, providing an opportunity to learn from them and potentially apply the lessons of experience to future forecasts.
The whole point of measuring the accuracy of your forecast is to improve it, and the only way I know of doing this is to try and understand why you have a gap.
It is therefore important that your method also includes a process for rapidly identifying the exceptions – those big deviations that caused the most problems – and asking yourself the simple question: could the causes have been anticipated? If so, you have clearly identified that better information or interpretation in this area will improve future forecasts. A very simple high-level process to follow for exception analysis is set out below, with a short sketch of the first two steps after the list:
1. Exception Analysis Preparation – define the rules that will be used to identify and classify exceptions.
2. Mining Phase – apply algorithms to the data to identify exceptions based on the pre-defined rules.
3. Research Exceptions – look for supporting information on the cause of these exceptions.
4. Submit Changes to Forecast – if the research changes the forecast and/or resolves the exception.
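As a simple sketch of steps 1 and 2 (the threshold rule, dates and volumes are illustrative, not a recommendation), exceptions can be mined by flagging any period whose absolute percentage error breaches a pre-defined rule:

```python
# Hypothetical exception rule: flag any period where |error| exceeds 20% of actual.
THRESHOLD_PCT = 20.0

records = [
    {"date": "2024-03-04", "actual": 1200, "forecast": 1150},
    {"date": "2024-03-05", "actual": 800,  "forecast": 1100},   # big over-forecast
    {"date": "2024-03-06", "actual": 950,  "forecast": 930},
]

exceptions = []
for r in records:
    pct_error = (r["forecast"] - r["actual"]) / r["actual"] * 100
    if abs(pct_error) > THRESHOLD_PCT:
        exceptions.append({**r, "pct_error": round(pct_error, 1)})

# Each flagged exception would then be researched (step 3) and, if explained,
# fed back into the forecast (step 4).
for e in exceptions:
    print(e)   # {'date': '2024-03-05', 'actual': 800, 'forecast': 1100, 'pct_error': 37.5}
```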
For example, an influx of customer contacts in the morning throughout the whole month will likely be built into the next forecast if you are using any type of historical method. However, knowing why this has happened informs you whether you should be including/excluding/smoothing this data for future forecasts. If the influx of calls was a result of a one-off TV advert being played during the mornings of that month only, would you want to include this in future forecasts?

Responses

  1. Thanks for a great article Doug. I strongly recommend your approach to all workforce planners.

    Many forecasters are measured on what is really just the statistically likely result. That's often no better than what good forecasting software can produce without human intervention. This approach of focusing on the value-add component recognises and encourages the real strengths of forecasters.

  2. Thanks Stuart,

    I am a true believer that the forecasting function (especially in large organisations) should be a dedicated, separate position. However, I suspect many enterprises don't see the value in having a dedicated forecaster for the very reason you state.

  3. Agreed – great article. Aside from promoting the importance of having dedicated forecasting staff, you've done a fine job of punctuating the need for that staff to have a sharp acumen for the work. Unfortunately, some forecasting resources (particularly "home-grown" resources) simply do not possess such a full complement of skills. If you would be so kind, please share the SAS white paper. I'm eager to see how my team and network of WFM colleagues can benefit from the information. Once again, kudos.

  4. Thanks for the feedback. I have sent a connection request on LinkedIn; alternatively, if you reply with your email address I would be happy to send you a copy – Doug

  5. Like the article, Doug, specifically because I was just asked to look into this in January! My math and methodology were pretty much the same: first looking into natural deviation within each call type by DOW and seasonal time blocks, and removing holiday-impacted days, then getting the variance of my forecast to actuals. The one part I got hung up on is that I'm not sure where to set my goals – is it 50% of the natural variation? At a daily level I am much tighter than 50% of the natural variation (probably closer to 25%), but at an interval level is it "safe" to set my goals at the natural variation or at some proportion of it? As a business we sometimes struggle even within very small variations, so we are looking to fix that more with our staffing flexibility than by getting forecast variances to 1%, but I do love to attain perfection so I want to set some goals for myself! Any ideas?

  6. Katie, really sorry for taking so long to reply – I only just came across your comment. It's really great to hear you are using a similar methodology. You certainly ask a tough question…

    I think there are cases where no matter what you do you will never get to the accuracy you want. But measuring and knowing where you stand can still be very valuable, and you are bang on to look at your staffing flexibility (I presume you refer to short notice tactical flexibility) as a source to overcome variance.

    I don't think there is any one target that will suit all situations, so I am going to sit on the fence and say it depends. It depends on a number of factors, but like all targets, set it at first to something a little stretching but achievable. If you are not already doing so, I highly recommend you map your whole forecasting process and, of course, as highlighted above, use FVA to measure each element's contribution to your overall accuracy. If you meet the target consistently, then stretch it again. This way you make sustainable improvements.

  7. This is a great article. I have been exploring a combination of MAPE and MAD along with other variation measures, such as looking at the number of errors above and below the forecast over a period of time to measure overall variation from the volume. The thing I struggle to get a grip on most is the cost of return for improvements in forecast accuracy, at MAPE level or any other level. The cause for my questions is that a loss of forecast accuracy will impact either service or labour costs, which do not always align and can be influenced by other actions along the planning journey. Any thoughts or related articles would be appreciated.

  8. to Anonymous…

    No easy answer to your question either, but then you would not be asking it here if there was.

    One thing I encourage all planners to get a grip on is their cost per call & agent and, if possible, revenue per call & agent. These are not easy metrics to uncover and you will need to engage both your operations and finance departments continuously to keep them up to date, but once uncovered they can make a planner's life much easier. In addition, they are very, very valuable metrics for decision makers from all parts of the business.

    Forecasting accuracy alone does not have a direct effect on either service or cost/revenue, which is why you are struggling to show the benefit of improvements in accuracy. However, we all know that without accuracy WFM fails… So how to show benefit? Well, armed with the above metrics you are halfway there; the next step is to see what effect it has on your WFM process later on. Does it reduce the number of FTE you need? Does it improve service and revenue? Etc. It all depends upon your unique situation.

    Hope this helps, and I am happy to discuss your particular case further if you so wish.

  9. Hi Doug — just came across your blog. Nice recap of the FVA approach for evaluating forecasting performance and identifying the weak points in the process.

    FVA has been widely adopted in consumer products and manufacturing. I was glad to see your application of the approach in workforce management.

    If any of your audience is interested in more on FVA and related forecasting topics, I do a weekly blog The Business Forecasting Deal at blogs.sas.com/content/forecasting. There is also a new article ("FVA: A Reality Check on Forecasting Practices") to appear in the Spring 2013 issue of Foresight.
    –Mike Gilliland, SAS
