Many practical business and engineering problems involve analyzing complicated processes. Enter Monte Carlo simulation. Performing Monte Carlo simulation in R allows you to step past the details of the probability mathematics and examine the potential outcomes directly.
Setting up a Monte Carlo Simulation in R
A good Monte Carlo simulation starts with a solid understanding of how the underlying process works. For the purposes of this example, we are going to estimate the production rate of a converting and packaging line: a set of machines we plan to buy that makes rolls of kitchen towels.
Our converting line makes a big roll of paper on a winder and slices it into smaller rolls that people can use in their homes. Next, each of these rolls goes into an individual bag (to keep it clean), and the bags are then placed in a cardboard box (so they don’t get crushed). We have estimates of the production rate of each step of the process, but we don’t know them for certain.
There is an additional constraint here: the converting line can only produce at the rate of its slowest component. So if the winder can make 5000 rolls per hour and the bagger can only bag 1500, the line is limited to the slower machine’s 1500 rolls per hour.
For purposes of this exercise, we believe the process is as follows:
- The winder can make 3000 – 5000 rolls per hour
- The bagger can make 2000 – 4000 rolls per hour
- The case packer can make 150 – 200 cases of 30 rolls each per hour (4500 – 6000 rolls per hour)
- The line will produce at the rate of the slowest of the three
- Assume all distributions are uniform
Coding a Monte Carlo Simulation in R
Using the rules above, we can lay out the simulation model for the process. For each hour of machine operation, we pick three numbers, one from each machine’s uniform distribution, convert cases to rolls, and take the minimum of the three. This is a simplified version of reality, but the same basic ideas still apply.
We can generate values from the uniform distribution in R using the runif function, which takes the number of draws, the minimum, and the maximum. Our model for a single hour therefore looks like this (run a few times):
# Monte Carlo Simulation in R Example
> min(runif(1,3000,5000), runif(1,2000,4000),runif(1,150,200)*30)
[1] 3427.724
> min(runif(1,3000,5000), runif(1,2000,4000),runif(1,150,200)*30)
[1] 2344.261
> min(runif(1,3000,5000), runif(1,2000,4000),runif(1,150,200)*30)
[1] 2543.24
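The one-line model above can also be wrapped in a small function so that each call represents one simulated hour. This is only a sketch: the name simulate_hour and the seed value are assumptions made for illustration, and set.seed is used so that the random draws can be reproduced from run to run.

```r
# simulate_hour() is a hypothetical helper name used for illustration only
simulate_hour <- function() {
  rolls <- runif(1, 3000, 5000)      # winder, rolls per hour
  bags  <- runif(1, 2000, 4000)      # bagger, rolls per hour
  cases <- runif(1, 150, 200) * 30   # case packer, cases per hour times 30 rolls per case
  min(rolls, bags, cases)            # the line runs at the slowest machine's rate
}

set.seed(123)                        # arbitrary seed, chosen only for reproducibility
one_hour <- simulate_hour()

# replicate() calls the function repeatedly and collects the results in a vector
three_hours <- replicate(3, simulate_hour())

print(one_hour)
print(three_hours)
```

Resetting the same seed before a run should give the same sequence of draws, which is handy when comparing changes to the model.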
We can build this out into a larger table of results through iteration.
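# full Monte Carlo Simulator in R
results <- NULL
for (k in 1:1000) {
  rolls <- runif(1, 3000, 5000)      # winder output, rolls per hour
  bags  <- runif(1, 2000, 4000)      # bagger output, rolls per hour
  cases <- runif(1, 150, 200) * 30   # case packer output, cases per hour converted to rolls
  total <- min(rolls, bags, cases)   # the line runs at the slowest machine's rate
  # append this hour's draws as a new row
  results <- rbind(results, data.frame(rolls, bags, cases, total))
}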
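As an alternative to growing the data frame row by row, the same simulation can be sketched in vectorized form. This assumes the same ranges as the loop above; the names n_hours and results_vec are introduced here for illustration. runif draws all the hours at once, and pmin takes the minimum element by element across the three vectors (plain min would collapse everything to a single number).

```r
set.seed(2024)                 # arbitrary seed for reproducibility
n_hours <- 1000                # number of simulated hours (illustrative name)

rolls <- runif(n_hours, 3000, 5000)      # winder, rolls per hour
bags  <- runif(n_hours, 2000, 4000)      # bagger, rolls per hour
cases <- runif(n_hours, 150, 200) * 30   # case packer, converted to rolls per hour

# pmin() is the element-wise ("parallel") minimum: one value per simulated hour
total <- pmin(rolls, bags, cases)

results_vec <- data.frame(rolls, bags, cases, total)
print(head(results_vec))
```

For 1000 hours either approach is fast enough, but the vectorized form tends to scale better if you later want many more iterations.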
So after we run the line for 1000 (virtual) hours, we take a peek at the data:
> head(results)
     rolls     bags    cases    total
1 3251.039 2520.863 5317.533 2520.863
2 4338.553 3600.642 5547.647 3600.642
3 4667.194 3524.255 5224.251 3524.255
4 4429.320 2967.733 5405.509 2967.733
5 3038.459 2401.518 5809.195 2401.518
6 3503.558 3254.978 4813.409 3254.978
> summary(results)
     rolls           bags          cases          total
 Min.   :3005   Min.   :2003   Min.   :4505   Min.   :2003
 1st Qu.:3502   1st Qu.:2469   1st Qu.:4879   1st Qu.:2469
 Median :3954   Median :2978   Median :5258   Median :2978
 Mean   :3980   Mean   :2992   Mean   :5253   Mean   :2944
 3rd Qu.:4482   3rd Qu.:3501   3rd Qu.:5616   3rd Qu.:3378
 Max.   :4999   Max.   :3997   Max.   :6000   Max.   :3979
Looking at the three components, the case packer is flying. The winder is doing fairly well. The bagger is the constraint: the speed of the overall manufacturing line is limited to the speed of putting the rolls into bags.
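To put a number on which machine is the constraint, one option is to count how often each machine is the slowest in a simulated hour. The sketch below re-simulates the line with the same ranges; the variable names rates, limiting, and share are assumptions for illustration. With these uniform ranges the case packer (at least 4500 rolls per hour) can never be slower than the bagger (at most 4000), and a short calculation gives the bagger as the limit in about 87.5% of hours and the winder in the remaining 12.5%, so the simulated shares should land close to those values.

```r
set.seed(42)                   # arbitrary seed for reproducibility
n <- 10000                     # more hours than the article's 1000, for a steadier estimate

rolls <- runif(n, 3000, 5000)      # winder, rolls per hour
bags  <- runif(n, 2000, 4000)      # bagger, rolls per hour
cases <- runif(n, 150, 200) * 30   # case packer, converted to rolls per hour

# one row per hour, one column per machine
rates <- cbind(rolls, bags, cases)

# which.min() returns the column index of the slowest machine in each row
limiting <- colnames(rates)[apply(rates, 1, which.min)]

# share of hours in which each machine was the bottleneck;
# factor() with explicit levels keeps machines that never limit the line (count 0)
share <- table(factor(limiting, levels = colnames(rates))) / n
print(round(share, 3))
```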
Simulating Process Improvements
Walking back to your office, you see an older piece of packaging equipment sitting idle. It’s an Ultraflow wrapper, an early version, which can make shrink wrapped bundles of paper towels. Better yet, you can install it next to the bagger, the device that was slowing down your line, so that any excess production goes to this second machine. For the rolls it handles, it also takes the place of the case packer.
With a couple of small adjustments to the calculations, we can simulate the performance of the redesigned production line:
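# improved process - Monte Carlo Simulator in R
results <- NULL
for (k in 1:1000) {
  rolls     <- runif(1, 3000, 5000)      # winder output, rolls per hour
  bags      <- runif(1, 2000, 4000)      # bagger output, rolls per hour
  ultraflow <- runif(1, 500, 1000) * 8   # Ultraflow wrapper output, rolls per hour
  cases     <- runif(1, 150, 200) * 30   # case packer output, converted to rolls per hour
  # the wrapper adds capacity alongside both the bagger and the case packer
  total <- min(rolls, bags + ultraflow, cases + ultraflow)
  results <- rbind(results, data.frame(rolls, bags, cases, ultraflow, total))
}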
Running some virtual hours of production, we see this changes the game.
> head(results)
     rolls     bags    cases ultraflow    total
1 4631.601 3251.744 5785.385  5703.448 4631.601
2 4204.387 3315.834 4976.051  7739.523 4204.387
3 3802.148 3776.298 5999.492  7589.712 3802.148
4 4021.700 2950.878 4757.196  6277.988 4021.700
5 4695.162 2019.212 5821.022  5455.306 4695.162
6 3125.197 3887.420 4783.042  7506.470 3125.197
> summary(results['total'])
     total
 Min.   :3005
 1st Qu.:3501
 Median :3992
 Mean   :3987
 3rd Qu.:4463
 Max.   :5000
Well, that certainly made a difference! Average production per hour is up by roughly 1000 rolls (a mean of 3987 versus 2944). The new piece of equipment sped up packaging, so we’re now limited by the speed of our paper roll winding machine. The next step (in the real world) would be to do some physical trials to ensure everything works as expected.
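The before-and-after comparison can also be made on the same set of random draws, which removes some of the noise from comparing two separate runs. The sketch below uses the ranges from the two simulations above; the names baseline, improved, and uplift are assumptions for illustration. Note that with these ranges the bagger plus the Ultraflow can always handle at least 6000 rolls per hour, more than the winder’s maximum of 5000, so the improved line should always run at the winder’s rate.

```r
set.seed(99)                   # arbitrary seed for reproducibility
n_hours <- 10000

rolls     <- runif(n_hours, 3000, 5000)      # winder, rolls per hour
bags      <- runif(n_hours, 2000, 4000)      # bagger, rolls per hour
cases     <- runif(n_hours, 150, 200) * 30   # case packer, converted to rolls per hour
ultraflow <- runif(n_hours, 500, 1000) * 8   # Ultraflow wrapper, rolls per hour

# original line: slowest of the three machines
baseline <- pmin(rolls, bags, cases)

# redesigned line: the wrapper adds capacity to both packaging steps
improved <- pmin(rolls, bags + ultraflow, cases + ultraflow)

# hour-by-hour gain from the redesign
uplift <- improved - baseline
print(summary(uplift))

# fraction of hours in which the winder is now the limiting machine
print(mean(improved == rolls))
```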
Other Applications of Monte Carlo Simulation
The beauty of using Monte Carlo simulation in R is that you can explore very complicated problems with limited statistical effort. If you can simulate the process in code, you’re in business.
For the industrial example above, we could have incorporated other factors into the model, such as operating conditions or worker skill level. We could have included setup time, downtime and maintenance, random failures, or supply problems. We could also have implemented other constraints, like the availability of raw materials, orders, or storage space.
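As one illustration of adding such a factor, here is a minimal sketch of random failures layered on top of the original three-machine model. The failure probability and the share of the hour lost are made-up numbers, not estimates from the example above, and the names p_fail, lost_fraction, and uptime are hypothetical.

```r
set.seed(7)                    # arbitrary seed for reproducibility
n <- 10000

rolls <- runif(n, 3000, 5000)      # winder, rolls per hour
bags  <- runif(n, 2000, 4000)      # bagger, rolls per hour
cases <- runif(n, 150, 200) * 30   # case packer, converted to rolls per hour

p_fail        <- 0.05   # hypothetical: chance that the line fails in a given hour
lost_fraction <- 0.5    # hypothetical: share of the hour lost when a failure occurs

# rbinom() with size = 1 gives one yes/no failure draw per simulated hour
failed <- rbinom(n, size = 1, prob = p_fail) == 1

# fraction of each hour the line is actually running
uptime <- ifelse(failed, 1 - lost_fraction, 1)

total_ideal  <- pmin(rolls, bags, cases)   # output with no failures
total_actual <- total_ideal * uptime       # output after accounting for failures

print(mean(total_ideal))
print(mean(total_actual))
```

The same pattern (draw a random factor, then adjust the hourly output) could be extended to setup time or supply shortages, with distributions chosen to match what you actually observe on the line.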
I’ve used Monte Carlo simulation for financial modeling, looking at the likelihood of a company running out of cash. The same concepts can be used to test the likelihood of successfully launching a product or getting a rigorous estimate of how long it will take to generate significant sales.
In the sciences, the same techniques can be used to model natural events. And for our friends in the social sciences, you can use Monte Carlo simulation for everything from modeling how fast information moves on a social network to teenage trends in high school. Oh wait… nobody understands those….