As SAP continue to enrich the functionality of Integrated Business Planning (IBP), in this blog I am going to describe the outlier detection and correction methods offered within IBP for demand (as at release 1608). I will not however be discussing which of these methods are the best, as from my experience I have found that the rules and methods vary depending on the organization’s perspectives and processes followed.
Why Outlier Detection and Correction?
An outlier can be defined as “an observation that lies outside the overall pattern of a distribution” (Moore and McCabe 1999). In the context of IBP for demand, the outlier is related to the sales history of a particular time bucket that differs from the other data points with a defined level of statistical significance. Often, the planners will be unable to explain the reason or root cause for such exceptions, for example the underlying information to distinguish between base demand and one-off lift is unavailable.
The key presumption for outlier correction is that, whatever the cause of the exception, it is a one-off event that should not be used to predict the level of demand in the future.
With the huge number of planning combinations that planners are expected to process, manually identifying and correcting outliers is challenging, if not impossible! Unless outliers in historical data are detected and corrected, the quality of the forecast that is generated will be poorer – garbage in, garbage out. So how does the functionality of IBP for demand help the planner to automatically identify and correct outliers?
Outlier Detection Methods Offered by IBP for Demand
In both SAP Advanced Planning and Optimisation (APO) and SAP Spare Parts Planning (SPP), the outlier correction cannot be run in isolation from forecast generation. This is where IBP for demand differs as it provides the flexibility to run outlier correction as part of the statistical forecast like in APO, or in a single pre-processing step to allow the potential for a manual review/alert prior to the execution of the statistical forecast.
I will now describe the different methods currently provided by IBP for demand to detect whether a data point should be considered as an outlier for the processing by one of the correction algorithms I will present in a later section.
Figure 1: Outlier Detection Method options in IBP for demand
- Variance Test
The variance test in IBP for demand evaluates the allowed upper and lower thresholds of the time series using the mean and standard deviation of the series. Any data point that either exceeds the upper threshold or falls below the lower threshold is detected as an outlier. The formulae for each are;
Upper threshold = Mean + (Multiplier * Standard Deviation)
Lower threshold = Mean – (Multiplier * Standard Deviation)
The multiplier – highlighted in red box in Figure 1 (like sigma factor in APO) is therefore the number of standard deviations either side of the mean to set the thresholds.
- Interquartile Range (IQR) Test
There are many blogs explaining the algorithm behinds IQR Test (also known as Box-and-Whisker Plot or Box Plot), but here’s my version:
Steps
Sort the data in ascending order and split the sorted series into two halves. (If an odd number of data points exist, ignore the median data point)
The median value of the first half is the first quartile (Q1), and the median value of the second half is third quartile (Q3).
The IQR is then the difference between Q3 and Q1.
Formula
IQR = Q3 – Q1
Upper threshold = Q3 + (Multiplier * IQR)
Lower threshold = Q1 – (Multiplier * IQR)
Any data point that exceeds the upper threshold or falls below the lower threshold is detected an Outlier.
A few observations;
- Whereas in APO or SPP we can set the sigma value between 0.01 and 9.99, the multiplier cannot be less than 1.
- In either APO or IBP for demand, the same sigma/multiplier value is used for both the upper threshold and lower threshold calculation. Note that some businesses prefer different values for the upper and lower threshold calculations.
- Unlike APO or SPP, IBP for demand does not offer the ex-post forecast value method of outlier detection.
- The periods boxed in red in the Figure 2 below are detected as outlier by IQR, and the periods that are colour shaded but not boxed are detected as outliers in both tests.
Figure 2: IQR and Variance Test Detection Results
- Neither in Fiori nor in Excel does IBP for demand provide a Box-and-Whisker Plot graphical representation. However, data analysts can create this chart themselves with Excel.
- IBP global configuration parameter HISTORY_MIN_VALUE (as in Figure 3) can be set to consider a minimum value if history is either NULL, or less than this minimum specified value. As per figure 3, if the actual history is less than 1 or NULL, then it will be considered as 1 in the outlier detection and correction methods. This configuration is planning area independent and the outlier detection/correction algorithms will consider this minimum value. Such a possibility doesn’t exist in APO or SPP. Is there a real business benefit of having this value set? I am not too sure if I am being honest.
Figure 3: HISTORY_MIN_VALUE configuration
- IBP for demand currently does not have the possibility to ignore leading 0 values in the time series which SAP APO offers (refer to Figures 4). So, if you have leading zeros, then you should use a forecast profile with the right number of historical periods. And change this profile every month (or could do this quarterly) until it reaches the same number of periods as per the maximum historical planning horizon. Yes, a painful exercise.
Figure 4: Ignore leading zeros in APO
Outlier Correction Methods Offered by IBP for Demand
I will now describe the methods available in IBP for demand to adjust those outliers identified in the detection step above. For reference, APO offers three methods of outlier correction;
Ex-post forecast value
Median value
Tolerance lanes (upper or lower threshold – see Figure 5)
Figure 5: Outlier correction to tolerance lane setting in APO
IBP for demand offers six different outlier correction methods to choose from (as in Figure 6):
Figure 6: Outlier correction methods in IBP for demand
Outlier Correction Method |
Outlier corrected value |
Correction with Mean |
Average of all data values |
Correction with Mean Excluding Outliers |
Average of all data values excluding all outliers. |
Correction with Tolerance |
Upper Threshold (if outlier exceeds the upper threshold of detection method) or Lower Threshold (if outlier falls below the lower threshold of detection method) |
Correction with Tolerance Excluding Outliers |
Newly evaluated value of Upper Threshold excluding all outliers (if outlier exceeds the upper threshold of detection method) or Newly evaluated value of Lower Threshold excluding all outliers (if outlier falls below the lower threshold of detection method) |
Correction with Median |
Median of all data values |
Correction with Median Excluding Outliers |
Median of all data values excluding all outliers |
In the figures below I show some results of the different detection and correction methods for a sample time series using historical periods parameter as 34 and multiplier as 1.
Figure 7: Results using Variance Test and various correction methods
Figure 8: Results using IQR Test and various correction methods
A few observations;
- The standard job log (Figure 9) doesn’t show any information about which data points were detected as outliers or corrected, although workaround solutions can be modelled to achieve this.
Figure 9: Job log of outlier detection and correction
- In the case of IQR Test, the outliers cannot be corrected to a value from the actual series. Example business case: the outlier that falls below the lower threshold should be corrected to the least non-outlier value in the series. Similarly, the outliers that fall above the upper threshold should be corrected to the highest non-outlier value in the series. This is almost like Winsorization (limiting extreme values in the statistical data) which is not possible in IBP for demand.
- While in SAP SPP it is possible to ignore the outliers that fall below the lower tolerance lane (as it’s common to have low or zero demand in spare parts business), it is not possible in IBP for demand to ignore just the lower tolerance lane.
Conclusion
The same as APO, for data sets exhibiting high degree of seasonality or higher trend values, the IBP for demand outlier methods may not work that well. As far as outlier detection/correction is concerned, there are no hard and fast rules to be followed by any organization. Since IBP for demand doesn’t offer flexibility to implement customer specific algorithms, the customers are forced to pick one of the standard techniques. Although some customer specific computations might be achieved through key figure calculations, more complex algorithms would require the use of L-code which only SAP can define.
So are the standard methods available sufficient? Statisticians or data scientists are better placed to answer this question, but as a stepping stone, I would suggest trying the various methods and comparing against your existing outlier detection/correction methods.
In the recent 1611 What’s New webinar from SAP, no new outlier correction/detection methods have been introduced. Will the ex-post method be offered in future releases? Would the “Forecast engine extensibility” mentioned by SAP on the IBP for demand roadmap bring any extra enhancements? I’m also eagerly waiting to see!
I do hope this blog was helpful, please do share your views and click here for a full list of other Olivehorse blogs.
Suresh Sellaiah
Senior SCM Consultant, Olivehorse Consulting