This post is about forecasting. I’m definitely not an expert in this area. I share what I’ve been studying recently and play with some ideas hoping this will maybe inspire you to do some research and rethink your forecasting methods. 99% of what I put here is based on a brilliant book by Daniel S. Vacanti “When Will It Be Done?” (which you should definitely read, BTW).
Let Us Hire Some Developers
Like every other IT company, you are hiring, right? You have a budget for X people to hire this year, and you would like to know “when will it be done?” (meaning, when they will all be hired). Let us say that all you know is that in the previous weeks your recruitment team managed to hire:
- week 1 – 0 new hires
- week 2 – 1
- week 3 – 2
- week 4 – 0
- week 5 – 4
Now, assuming you want to hire 25 more people, when will it be done?
Averages FTW!
Obviously, you can base your expectations (forecasts) on the average.
So far we hired 7 people in 5 weeks, which gives an average of 1.4 new hires per week. Now you calculate 25/1.4 and you learn that you need ~18 weeks. And because 5 weeks have already passed you expect to finish in week 23.
Let us say another week passes, and this time we have:
- week 6 – 3 new hires
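So what can you do? Well, you repeat the calculations. The average is now 10/6 = ~1.7 hires per week, and 22 people are still missing (25 minus the 3 hired in week 6), so you need 22/1.67 = ~13.2, which rounds up to 14 weeks. Hurray! It seems you will finish in week 20.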
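The two calculations above can be sketched in a few lines of Python. This is only an illustration: the function name `average_forecast` is made up for this post, and it assumes we round up to whole weeks and count the 25 hires from the end of week 5 (so 22 are still missing after week 6).

```python
import math

def average_forecast(weekly_hires, remaining):
    """Return (average per week, weeks still needed, expected finish week)."""
    weeks_elapsed = len(weekly_hires)
    average = sum(weekly_hires) / weeks_elapsed
    # Assumption: a partly used week counts as a full week, so round up.
    weeks_needed = math.ceil(remaining / average)
    return average, weeks_needed, weeks_elapsed + weeks_needed

after_week_5 = average_forecast([0, 1, 2, 0, 4], remaining=25)
after_week_6 = average_forecast([0, 1, 2, 0, 4, 3], remaining=22)

print(after_week_5)  # average 1.4, 18 more weeks, finish in week 23
print(after_week_6)  # average ~1.67, 14 more weeks, finish in week 20
```

One good week was enough to pull the forecast three weeks forward, which already hints at how fragile a single number is.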
The One Number To Fool Them All
As we can see, a forecast based on averages is a single number. Very convenient, isn’t it? Definitely easy to communicate to your boss.
It simply feels good to give one number and be convinced (and convince others) that this is THE number that matters. Yes, it is always fun to fool yourself and others. 🙂
So, When Will It Be Done?
According to our (perfect) calculations performed after week 6, we expect to hire all people in week 20. Are we certain this will happen? Definitely not!
We will be done in 20 weeks only if we keep the same pace as we had during the first six weeks.
It is enough to compare the average hiring speed of the first five weeks (1.4) to the average speed of hiring in the first six weeks (1.7) to see that the pace changes over time. So we probably won’t do it in exactly 20 weeks. Maybe sooner, maybe later – but with what certainty? We have no clue based on the average alone. All we know is this one number.
Hm, doesn’t it bother you? It bothers me for sure!
Forecasting Result Depends on the Past You Use
…but before we move on to another method, let me share with you this one bit of forecasting wisdom. No matter the forecasting technique we use, one thing is certain:
“The future that you predict is going to resemble the parts that you used to predict it”.
Daniel Vacanti, “When Will It Be Done?”
Which basically means that the input determines the output. And depending on the part of the past we take into account for our estimations, we will end up with different results.
For example, we could decide to take only the last 4 weeks into account, which would significantly change our predictions.
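To see how much the choice of past matters, here is a small sketch comparing the forecast after week 5 based on all five weeks with one based only on the last four. The helper `windowed_forecast` and its `window` parameter are hypothetical names used only for this illustration, and the result is rounded up to whole weeks.

```python
import math

def windowed_forecast(weekly_hires, remaining, window=None):
    """Average-based forecast using only the last `window` weeks (all weeks if None)."""
    sample = weekly_hires if window is None else weekly_hires[-window:]
    average = sum(sample) / len(sample)
    # Assumption: round up to whole weeks.
    return average, math.ceil(remaining / average)

history = [0, 1, 2, 0, 4]
all_weeks = windowed_forecast(history, remaining=25)
last_four = windowed_forecast(history, remaining=25, window=4)

print(all_weeks)  # average 1.4 -> 18 weeks
print(last_four)  # average 1.75 -> 15 weeks
```

Dropping a single week with zero hires shortens the forecast by three weeks, even though nothing about the recruitment team has changed.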
Monte Carlo to the Rescue – A Stronger Forecasting Technique
OK, enough of the averages, let us try a different approach – Monte Carlo Simulation. We will start with the initial setting:
- week 1 – 0 new hires
- week 2 – 1
- week 3 – 2
- week 4 – 0
- week 5 – 4
If you run a Monte Carlo Simulation ten thousand times you will end up with something similar to this:
Each bar represents the probability of hiring 25 people in a specific week or earlier. For example, 75.98% in week 21 means that we can assume with this probability that all 25 will be hired by then (maybe earlier, maybe exactly in week 21, but they will be hired).
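If you would like to try this yourself, below is a minimal sketch of such a simulation. It assumes the simplest possible model: each future week is a random draw (with replacement) from the weeks we have already seen, and a trial ends once 25 hires are reached. The function names and the fixed `seed` are my own choices for this illustration, not something taken from the book, and your percentages will differ slightly from the ones quoted in this post.

```python
import random
from collections import Counter

def simulate_weeks_to_target(history, target, rng):
    """One trial: draw past weekly results at random until `target` hires are reached."""
    hired, weeks = 0, 0
    while hired < target:  # terminates as long as history contains a non-zero week
        hired += rng.choice(history)
        weeks += 1
    return weeks

def cumulative_forecast(history, target, runs=10_000, seed=42):
    """Map each week to the share of trials that finished in that week or earlier."""
    rng = random.Random(seed)
    finishes = Counter(
        simulate_weeks_to_target(history, target, rng) for _ in range(runs)
    )
    cumulative, done = {}, 0
    for week in range(min(finishes), max(finishes) + 1):
        done += finishes[week]
        cumulative[week] = done / runs
    return cumulative

forecast = cumulative_forecast([0, 1, 2, 0, 4], target=25)
for week, probability in forecast.items():
    print(f"week {week:2d}: {probability:6.2%}")
```

Each printed line corresponds to one bar: the probability of being done in that week or earlier.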
So, When Will It Be Done?
As you can see, the earliest we are able to hire all 25 people is after 7 weeks. If you think about it, it makes perfect sense – historically the best our hiring team did was 4 per week, and if they happen to do it every single week they would finish in 7 weeks (7×4 = 28, while 6×4 = 24 is not enough). However, it is not very likely that the recruitment team will reach their “personal best” every week. In fact, our chart shows we have only a 0.03% chance of finishing in 7 weeks!
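The 7-week case is small enough to check without any randomness. The sketch below assumes the same model as a plain resampling simulation (every future week is an equally likely repeat of one of the five past weeks) and simply enumerates all 5^7 possible 7-week sequences.

```python
from itertools import product

history = [0, 1, 2, 0, 4]
target, weeks = 25, 7

# Every possible 7-week sequence built from the five observed weeks.
total = len(history) ** weeks
successes = sum(1 for seq in product(history, repeat=weeks) if sum(seq) >= target)
probability = successes / total

print(f"{successes} of {total} sequences reach {target}: {probability:.4%}")
# 15 of 78125 sequences reach 25: 0.0192%
```

Under this assumption the exact chance is about 0.02%, so seeing 3 hits in 10,000 simulated runs (0.03%) is well within the noise you should expect from a simulation.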
The other side of the chart shows that we can be almost certain that we finish the job in week 38 (or sooner). In fact, only a very unlucky streak of events (hiring 0 developers almost every week) would make the hiring process last longer than 38 weeks.
And somewhere in between there is this point for which you are ready to say “we will make it in X weeks”. Which number is it for you? Is it week 18 when we break the 50%? Or maybe week 21 with probability close to 75%? Or maybe you need more confidence and you feel week 23 – around 85% – is the tipping point?
Forecast = Range + Probability
Compared to our averages-based approach we can see that our forecast is no longer a single number, but a list of numbers, each with an assigned probability (uncertainty). Hm, interesting. So we don’t say “we will finish in 30 weeks”, but we say “we will finish in 30 weeks or earlier with X% probability”, “we will finish in week 26 or earlier with Y% probability” and so on.
Now have a look at this chart:
It presents the probability of finishing in exactly a given week. It is most probable that we finish in weeks 17 and 18… Hm… interesting, this resembles the outcome of our average, right?
It seems that using averages is like taking the most probable outcome and saying “this is what will happen”. It is like throwing two dice and saying “I expect they sum up to 7” (7 happens to be the most probable outcome for such experiment). Not very smart, right?
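The dice example is easy to verify by listing all 36 combinations. A tiny sketch:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each sum appears among the 36 equally likely rolls.
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))
most_likely, count = sums.most_common(1)[0]
probability = Fraction(count, 36)

print(most_likely, probability)  # 7 1/6
```

So even the single most probable outcome happens only once in six throws, and betting everything on it means being wrong about 83% of the time.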
One Week Later
Out of curiosity, let us see what happens next week. We have one more data point to consider:
- week 6 – 3
After rerunning Monte Carlo Simulation (10k times again) we will see something like this:
And if you are interested in how the two compare, have a look here:
As you can see, the new information – hiring 3 developers in week 6 – “moved” the bars to the left increasing the probability of finishing the hiring process sooner than we expected after week 5.
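To compare the two forecasts without simulation noise, the sketch below swaps the Monte Carlo runs for an exact calculation of the same resampling model (it tracks the probability of every possible number of hires week by week). Two things here are my assumptions: the target stays at 25 in both cases, and the function names and the 60-week horizon are picked just for this illustration.

```python
def exact_cumulative(history, target, max_weeks=60):
    """P(at least `target` hires by week w) for w = 1..max_weeks, computed exactly."""
    # state[h] = probability of having exactly h hires so far (h < target)
    state = [0.0] * target
    state[0] = 1.0
    done, cumulative = 0.0, []
    for _ in range(max_weeks):
        new_state = [0.0] * target
        for hired, p in enumerate(state):
            if p == 0.0:
                continue
            share = p / len(history)  # each past week is equally likely to repeat
            for outcome in history:
                if hired + outcome >= target:
                    done += share
                else:
                    new_state[hired + outcome] += share
        state = new_state
        cumulative.append(done)
    return cumulative

def week_reaching(cumulative, confidence):
    """First week whose cumulative probability is at least `confidence`."""
    for week, probability in enumerate(cumulative, start=1):
        if probability >= confidence:
            return week
    return None

after_5 = exact_cumulative([0, 1, 2, 0, 4], target=25)
after_6 = exact_cumulative([0, 1, 2, 0, 4, 3], target=25)
for confidence in (0.50, 0.85, 0.95):
    print(f"{confidence:.0%}: week {week_reaching(after_5, confidence)} "
          f"-> week {week_reaching(after_6, confidence)}")
```

For each confidence level the week after adding the week 6 data point should come out no later than before, which is the “bars moving to the left” effect described above.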
Forecasting: Averages vs. Monte Carlo
In this blog post I’ve presented two approaches to predicting the future (forecasting) based on past outcomes. The first one is based on averages. It is dead simple and doesn’t offer a lot of forecasting power. The second approach is based on Monte Carlo Simulation. It is more complicated but provides much more information about possible outcomes, thus allowing us to make better decisions.
Which one to use? Decide for yourself.
P.S. Currently, I tend to overuse Monte Carlo Simulation. Well, you know what they say: “To a man with a hammer, everything looks like a nail.” Very true, especially if somebody falls in love with his new tool 🙂 Happy forecasting!