This article about R’s rpois function is part of a series about generating random numbers using an R function. The rpois function simulates draws from the Poisson distribution, which is commonly used to model the number of events occurring within a specific time window.
R and the Poisson Distribution
We’re going to start by introducing the rpois function and then discuss how to use it.
The Poisson distribution is commonly used to model the number of events a process produces when we know the average rate at which events occur during a given unit of time. It is a discrete distribution, described by a probability mass function, and it is the basis of Poisson regression for count data.
For example, let us assume that 10 shoppers enter a store per minute.
Can we generate a simulation of the number of customers per minute for the next 10 minutes?
We describe the process as:
- A window of observation – a specific time period in which events can occur
- A rate of occurrence – how often an event is expected to occur in that window
- The number of times an event occurs (the observation)
R’s rpois function generates Poisson random variable values from the Poisson distribution and returns the results. The function takes two arguments:
- Number of observations you want to see (the argument n)
- The estimated rate of events for the distribution (the argument lambda); this is expressed as average events per period
The expected syntax is:
rpois(n, lambda)
Continuing our example from above:
# r rpois - poisson distribution in r examples
rpois(10, lambda = 10)
[1]  6 10 11  3 10  7  7  8 14 12
As you can see, there is some variation in the customer volume. Several minutes have seven or eight customers. One has six. And apparently there was a mad dash of 14 customers at some point. The Poisson distribution describes this kind of variation in the throughput of a Poisson process.
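Because rpois draws random values, every run returns different counts from the ones shown above. The following is a minimal sketch of how the simulation could be made repeatable and sanity-checked, assuming an arbitrary seed of 42 and the illustrative variable names arrivals and long_run (neither appears in the original example):

```r
# Fix the random seed so the simulation is repeatable (42 is an arbitrary choice)
set.seed(42)

# Simulate customer arrivals for 10 one-minute windows at 10 customers per minute
arrivals <- rpois(10, lambda = 10)
print(arrivals)

# Over many simulated minutes the sample mean and variance both approach lambda
long_run <- rpois(100000, lambda = 10)
mean(long_run)
var(long_run)
```

For a Poisson distribution the mean and the variance are both equal to the rate, so both summary values should land close to 10.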
Practical Uses of Poisson Distribution
The Poisson distribution is commonly used within industry and the sciences.
The classical example of the Poisson distribution is the number of Prussian soldiers accidentally killed by horse-kick, which was the first application of the Poisson distribution to a large real-world data set. Ten army corps were observed over 20 years, for a total of 200 corps-year observations, and 122 soldiers were killed by horse-kick over that time period. The question is how many deaths to expect in a single corps over a year, and the observed counts turn out to be modeled very well by a Poisson random variable with a rate of 122 / 200 = 0.61 deaths per corps per year.
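As a rough sketch of that calculation, the rate can be estimated from the totals quoted above and passed to dpois to get the expected number of corps-years with 0, 1, 2, 3, or 4 deaths. The observed vector below is an assumption: it is the commonly cited tabulation of this data set and is not given in this article.

```r
# Figures quoted in the text: 122 deaths across 200 corps-year observations
deaths <- 122
observations <- 200
rate <- deaths / observations   # 0.61 deaths per corps per year

# Expected number of corps-years with 0, 1, 2, 3, 4 deaths under a Poisson model
expected <- observations * dpois(0:4, lambda = rate)

# Assumption: commonly cited tabulation of the observed corps-years,
# which is not given in this article
observed <- c(109, 65, 22, 3, 1)

data.frame(deaths = 0:4, observed = observed, expected = round(expected, 1))
```

Comparing the two columns side by side is a quick way to judge how closely the Poisson model tracks the recorded counts.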
Some other examples:
- average number of equipment failures per day for logistics company
- average number of customers arriving at a retailer
- the number of visitors to a web site
- number of inbound phone calls
- number of customer complaints
Related functions: ppois, qpois, dpois
Need the probability mass function for the Poisson distribution?
Example: Customers call us at a rate of 12 per minute. What is the probability of having exactly twenty customers call us within the span of a minute?
You should use R’s dpois probability mass function. It calculates the probability of getting exactly X events within a period where the average rate is Z. Example code below:
# dpois r - calculate poisson distribution probability in r
dpois(20, lambda = 12)
The example above indicates the probability of twenty calls in a minute is under 1%.
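The value returned by dpois can be checked by hand against the Poisson probability mass function, P(X = k) = exp(-lambda) * lambda^k / k!. A small sketch, using the illustrative names p_builtin and p_manual:

```r
lambda <- 12   # average calls per minute
k <- 20        # number of calls we are asking about

# Built-in probability mass function
p_builtin <- dpois(k, lambda = lambda)

# Same quantity computed directly from the Poisson formula
p_manual <- exp(-lambda) * lambda^k / factorial(k)

p_builtin
all.equal(p_builtin, p_manual)
```

Both approaches should agree up to floating-point tolerance, which is why all.equal is used here rather than ==.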
What if we want to look at the cumulative probability of the poisson distribution?
Example: Customers call us at a rate of 12 per minute. “The boss” wants us to deliver excellent service and stay very productive. Our service will suffer if we get more than twenty calls in a minute. We’re going to look lazy if five or fewer calls arrive in a minute. What are the odds of getting in trouble with the boss?
For this problem, we’re going to use R’s ppois function, which gives the cumulative probability of seeing a given number of events or fewer. This is a digital version of the table of probabilities included as an appendix in your favorite statistics book. It includes the option of specifying whether we’re interested in the upper or lower tail of the distribution.
# simulating poisson process r
# cumulative poisson distribution
# ppois r - odds of more than 20 people calling
# use lower.tail = FALSE to take the upper tail, P(X > 20)
ppois(20, lambda = 12, lower.tail = FALSE)
# ppois r - odds of 5 or fewer people calling
# default setting uses the lower tail of the distribution, P(X <= 5)
ppois(5, lambda = 12)
Overall, not bad, although there is a slight probability the boss will be yelling at any given moment for either reason. Eh, what can you do….
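To put a single number on "getting in trouble", the two tail probabilities can be added together, since a minute cannot be both too quiet and too busy. A minimal sketch, with p_too_quiet, p_too_busy, and p_trouble as illustrative names:

```r
lambda <- 12   # average calls per minute

p_too_quiet <- ppois(5, lambda = lambda)                       # P(X <= 5)
p_too_busy  <- ppois(20, lambda = lambda, lower.tail = FALSE)  # P(X > 20)

# The two events cannot happen in the same minute, so the probabilities add
p_trouble <- p_too_quiet + p_too_busy
round(c(quiet = p_too_quiet, busy = p_too_busy, trouble = p_trouble), 4)
```

The combined probability works out to a little over 3% for any given minute.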
What’s the difference between ppois and dpois?
dpois gives the probability of getting exactly that result, a single point on the Poisson distribution, which is a discrete distribution. ppois calculates the cumulative probability of getting a result equal to or below that point. In the call center example: dpois is the probability of getting exactly 5 calls; ppois calculates the probability of getting 5 or fewer calls.
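The relationship between the two can be sketched in a few lines: summing dpois over every count from 0 up to a point should reproduce what ppois reports for that point. The variable names here are illustrative only.

```r
lambda <- 12

p_exactly_5 <- dpois(5, lambda)   # probability of exactly 5 calls
p_at_most_5 <- ppois(5, lambda)   # probability of 5 or fewer calls

# Adding up the point probabilities for 0, 1, ..., 5 gives the cumulative value
p_summed <- sum(dpois(0:5, lambda))

all.equal(p_at_most_5, p_summed)
```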
Need to set a cutoff score for a given point in the poisson distribution?
Take a look at R’s qpois function, which is the quantile function (the inverse cumulative distribution function) of the Poisson distribution. This is the inverse of the operation performed by ppois. You provide the function with a percentile of the cumulative distribution, and it returns the smallest number of events whose cumulative probability is at least that percentile. Because the Poisson distribution is discrete, the result is always a whole number of events, unlike the quantiles of continuous distributions such as the normal, chi-squared, Weibull, or logistic distributions.
# r qpois - inverse poisson distribution
qpois(0.25,lambda = 12)
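One way to see how qpois and ppois fit together is to feed the cutoff back into ppois. A short sketch, assuming the illustrative names target and cutoff:

```r
lambda <- 12
target <- 0.25   # percentile we want a cutoff for

# Smallest number of calls whose cumulative probability reaches the target
cutoff <- qpois(target, lambda = lambda)
cutoff

ppois(cutoff, lambda)       # at or above the target
ppois(cutoff - 1, lambda)   # one call fewer falls below the target
```

Because only whole numbers of events are possible, the cumulative probability at the cutoff usually overshoots the requested percentile rather than matching it exactly.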
Taken as a group, these functions let you simulate and analyze the Poisson distribution in R.
This is part of our series on sampling in R.