One simple way for CFOs to manage forecast uncertainties (using Monte Carlo and R)

One simple way for CFOs to manage forecast uncertainties (using Monte Carlo and R)Tristan GanryBlockedUnblockFollowFollowingMay 8, 2018There are countless ways for Finance professionals to forecast sales and the intent is not to cover them.

The goal of this article is to help finance professionals better manage forecast uncertainties using using Monte Carlo and R by being able to answer one of the following questions for the business:“What’s the likelihood your organization will reach the target/goal?”“What’s your organization’s forecast risk profile?”It’s no longer just about estimating targets (for example: reforecasting Q4 based on what we know happened in the past).

It’s about evaluating the level of risk within a target and having a data-driven business conversation about what we do about it to achieve the desired outcomes.

It will also give Finance professionals the opportunity to be much more strategic in the way they manage their forecast (e.

g.

is it time to consider including a new opportunity coming down the pipe? should we wait? did we over or under promise?).

While it’s relatively easy to do Monte Carlo simulations in Excel, I personally prefer using R given the breath of libraries that can be leveraged for fitting distribution and for generating sample data for the Monte Carlo Simulation.

For this exercise, we will use a simplified case and it requires few libraries.

With a few lines of codes, we will be able to get back to the business with a clear answer of:“The risk is high.

As things are today, it’s unlikely that we will reach target (e.

g.

less than 33% chances).

Let’s come up with a plan to bridge that gap.

”What we need for this example:A business question / context — “We closed Q3 at 141.

7m and we have an annual target of 186.

0m; what’s our probability to achieve it?”An observed time series — quantity sold of a certain product ABC by month.

An annual target — at least 186.

0m products ABC sold by end of year.

The approach we will use to answer the question is as follows:Step 1: Prep the data.

Step 2 A: Fit a traditional distribution to the observed time series.

Step 2 B: Fit a “supplemental” distribution to the observed time series.

Step 3: Generate a data table with 10,000 samples for October, November, and December.

Step 4: Generate probabilities and compare both options (Option A = 26.

75% / Option B = 33.

2%).

A big takeaway, in that example, is that being more precise by leveraging a better fitted distribution doesn’t significantly change the business outcome (e.

g.

5% risk difference between the 2 approaches).

Step 5: Report to the business that the risk is high and, as things are today, it’s unlikely that they will reach target (e.

g.

less than 33% chances).

Then, come up with a plan to bridge that gap.

Step 1: Prep the data| library(RCurl)# Read datamydata <- read.

csv(text = getURL("https://raw.

githubusercontent.

com/tristanga/MonteCarlo_ForecastRisk/master/TS.

csv"))# Create time serietseries <- ts(mydata\$x, frequency = 12, start = c(2000, 1))# Check the time seriestart(tseries)end(tseries)frequency(tseries)# Remove Q4tseries_sub <- window(tseries, start=c(2000, 1), end=c(2015,9))# Define your targetmytarget = 186.

0000# Calculate actualsactualYTD <- sum(window(tseries, start=c(2015, 1), end=c(2015,9)))# Check the distribution of your time seriehist(tseries_sub)boxplot(tseries_sub)tseries_df = as.

data.

frame(tseries_sub)Step 2 A: Fit a traditional distribution to the observed time series.

|library(fitdistrplus)fit.

norm <- fitdist(as.

numeric(tseries_df\$x), "norm")fit.

exp <- fitdist(as.

numeric(tseries_df\$x), "exp")fit.

weibull <- fitdist(as.

numeric(tseries_df\$x), "weibull")fit.

lnorm <- fitdist(as.

numeric(tseries_df\$x), "lnorm")fit.

gamma <- fitdist(as.

numeric(tseries_df\$x), "gamma")fit.

logistic <- fitdist(as.

numeric(tseries_df\$x), "logis")fit.

cauchy <- fitdist(as.

numeric(tseries_df\$x), "cauchy")# Compare Goodness-of-fit statisticsgofstat(list(fit.

norm, fit.

exp, fit.

weibull,fit.

lnorm,fit.

gamma,fit.

logistic,fit.

cauchy), fitnames = c("fit.

norm", "fit.

exp", "fit.

weibull","fit.

lnorm","fit.

gamma","fit.

logistic","fit.

cauchy"))# the best Goodness-of-fit statistics is for the normal distributionoption1 = fit.

normsummary(option1)Step 2 B: Fit a “supplemental” distribution to the observed time series.

|# Using Supplementary Distributions to fit the second optionlibrary(SuppDists)parms<-JohnsonFit(as.

numeric(tseries_df\$x), moment="quant")# plot the distributionhist( as.

numeric(tseries_df\$x) , freq=FALSE)plot(function(x)dJohnson(x,parms), 0, 20, add=TRUE, col="red")# let's create samples for october, november and decemberoption2 <- function(x)qJohnson(x,parms)Step 3: Generate a data table with 10,000 samples for October, November, and December.

|#Option 1library(truncnorm)fit.

coef <- coef(fit.

norm)final_df1 <- as.

data.

frame(rtruncnorm(n=10^4, a=min(tseries_df\$x), b=max(tseries_df\$x), mean=fit.

coef["mean"], sd=fit.

coef["sd"]))colnames(final_df1) <- 'Oct'final_df1\$Nov <- rtruncnorm(n=10^4, a=min(tseries_df\$x), b=max(tseries_df\$x), mean=fit.

coef["mean"], sd=fit.

coef["sd"])final_df1\$Dec <- rtruncnorm(n=10^4, a=min(tseries_df\$x), b=max(tseries_df\$x), mean=fit.

coef["mean"], sd=fit.

coef["sd"])final_df1\$Forecast <- actualYTD + final_df1\$Oct + final_df1\$Nov +final_df1\$Dec#plot histogram of forecasted quantitieshist(final_df1\$Forecast)#Option 2option2 <- function(x)qJohnson(x,parms)option2sample <- option2(runif(10000))hist(option2sample)boxplot(option2sample,as.

numeric(tseries_df\$x) )final_df2 <- as.

data.

frame(option2(runif(10000)))colnames(final_df2) <- 'Oct'final_df2\$Nov <- option2(runif(10000))final_df2\$Dec <- option2(runif(10000))final_df2\$Forecast <- actualYTD + final_df2\$Oct + final_df2\$Nov +final_df2\$Dec# Plot Option 2hist(final_df2\$Forecast)Step 4: Generate probabilities and compare both options (Option A = 26.

75% / Option B = 33.

2%).

|boxplot(final_df1\$Forecast,final_df2\$Forecast)myproba1 <- sum( final_df1\$Forecast >= 186 ) / 100myproba2 <- sum( final_df2\$Forecast >= 186 ) / 100Step 5: Report to the business that the risk is high and, as things are today, it’s unlikely that they will reach target (e.

g.

less than 33% chances).

Then, come up with a plan to bridge that gap.

Next steps:Increase complexity (distribution of each month vs.

entire time series, profit vs.

quantity sold, include growth/CAGR in the model, external variables, etc.

).

Write about other relevant statistical methods for forecasting quantitative time series for Finance professionals.

Please note that there are some risks/considerations to keep in mind with the Monte Carlo approach: 1) We are using the past to predict the future without accounting for any external factors beyond previous quantity sold distribution (population, price, etc.

), and 2) The observed time series distribution will change over time; while the model in option 2 should follow the changes it will be important to check from time to time that it still follows closely the selected distribution.

The code is available on Github.

Thank you for reading my post.

It is my first post on Medium.

Feel free to contact me if you have faced similar challenges regarding how you solved them or if you want help implementing it in your organization.

.