R users engaged in simulation modeling and probability analysis will often wish to simulate random events. Indeed, random numbers are a common element in many computer languages. They are particularly common in games, but they have applications in statistical data analysis as well.
R, like many other programming languages, has several pseudo-random number generator functions. A key concern when using them is the degree of randomness involved in the numbers. This takes two forms:
- Confidence that a simulated trial is sufficiently random, from the perspective of the users
- When debugging, the ability to test your code repeatedly against a consistent set of “random” data.
In either case, you will want to use the set.seed function in R to control where the sequence of random numbers starts.
R’s Random number generator
Random number generators are a common feature of programming languages, and R is no exception. R provides random number functions for many statistical distributions; nine of the most commonly used are listed below. The problem is that computers cannot generate truly random numbers, which is why these are referred to as pseudo-random number generators. One way of adding true randomness to random number generation is to base the number on something outside the computer itself, such as the user. In all cases, the pseudo-random number generator uses a number called a seed to determine the next number in the sequence.
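To make the role of the seed concrete, here is a minimal sketch of a toy generator. It is a linear congruential generator with the well-known "minimal standard" constants, and it is not the algorithm R uses by default. The function name make_lcg is invented for this illustration.

```r
# Toy linear congruential generator (illustration only, NOT R's default RNG).
# make_lcg is a hypothetical helper written for this sketch.
make_lcg <- function(seed) {
  state <- seed
  multiplier <- 16807
  modulus <- 2147483647              # 2^31 - 1
  function() {
    # The new state is computed from the old one and kept for the next call.
    state <<- (multiplier * state) %% modulus
    state / modulus                  # scale the state into (0, 1)
  }
}

gen_a <- make_lcg(seed = 4)
gen_b <- make_lcg(seed = 4)
draws_a <- c(gen_a(), gen_a(), gen_a())
draws_b <- c(gen_b(), gen_b(), gen_b())

# Two generators started from the same seed produce the same sequence.
identical(draws_a, draws_b)
```

Because every value is computed from the previous state, the whole sequence is fixed once the seed is chosen, which is exactly why such generators are called pseudo-random.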
Random number seed
A random number seed is an integer that R’s random number generator uses as the starting point for a sequence. By setting this number, you can ensure that the sequence of numbers is always the same. The converse is true as well: by choosing a seed that varies from run to run (for example, minute fractions of time on a clock), you can make the sequence hard to predict. After each number is generated, the generator updates its internal state and uses the new state to calculate the next number. In simple generators this state is just the last number produced; R’s default generator, the Mersenne Twister, keeps a larger state, which is stored in .Random.seed.
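As a rough sketch of the clock-based idea, the snippet below derives a seed from the sub-second part of the current time. The variable names and the scaling constants are arbitrary choices made for this example, not a recommended recipe.

```r
# Derive a seed from the microsecond part of the current time.
# The constants 1e6 and 1e9 are arbitrary choices for this sketch.
now <- as.numeric(Sys.time())
clock_seed <- as.integer((now * 1e6) %% 1e9)

set.seed(clock_seed)
run_1 <- runif(3)

# Keeping the seed means even a clock-based run can be replayed later.
set.seed(clock_seed)
run_2 <- runif(3)
identical(run_1, run_2)
```

Recording clock_seed alongside your results gives you both behaviours: a seed that differs between runs, and the option to reproduce any particular run.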
set.seed in R
set.seed is R’s function for setting the random number seed. It has the form set.seed(number), where number is a whole number value. If you enter a number with a decimal component, set.seed uses only the whole-number part. The generator’s state is then updated after every number it produces, which is what yields a pseudo-random string of numbers. The example below compares the effect of set.seed on sets of points randomly selected from a normal distribution using rnorm.
    # set.seed in R
    # random number generator control example

    # default results - generates random numbers
    > rnorm(4)
    [1] -1.3754972  0.1967718 -1.4098400 -1.3294793
    > rnorm(4)
    [1]  0.3879453  0.6372902 -0.7712165 -1.0915287
    > rnorm(4)
    [1] -0.1425324  0.3312602  0.1960673 -1.3108896

    # controlled random numbers using set.seed in r
    > set.seed(4)
    > rnorm(4)
    [1]  0.2167549 -0.5424926  0.8911446  0.5959806

    # reset the seed to same seed value, get same result
    > set.seed(4)
    > rnorm(4)
    [1]  0.2167549 -0.5424926  0.8911446  0.5959806
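The point about decimal seeds can be checked directly. The short sketch below seeds once with 4.9 and once with 4 and compares the draws. The variable names are illustrative, and the comparison should report TRUE if only the whole-number part of the seed is used.

```r
# Seed with a decimal value, then with its whole-number part.
set.seed(4.9)
from_decimal <- rnorm(4)

set.seed(4)
from_whole <- rnorm(4)

# Compare the two sets of draws.
identical(from_decimal, from_whole)
```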
You can reset the seed as often as you want, and calling set.seed(NULL) re-initializes the generator as if no seed had been set. Combined with R’s many random number functions, this greatly increases R’s power for producing random numbers.
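A single seed governs every random number function, not just rnorm. The sketch below, with an arbitrarily chosen seed value and illustrative variable names, draws from two different distributions, resets the seed, and draws again.

```r
# First pass: draws from two different distributions after one seed.
set.seed(2024)
normal_first  <- rnorm(2)
uniform_first <- runif(2)

# Second pass: resetting the seed replays the whole sequence of calls.
set.seed(2024)
normal_again  <- rnorm(2)
uniform_again <- runif(2)

identical(normal_first, normal_again)
identical(uniform_first, uniform_again)
```

Note that the replay only matches if the calls are made in the same order, since every draw advances the shared generator state.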
Random number generation in R
R provides pseudo-random generators for many distributions. Nine of the most commonly used are as follows.
- Uniform Distribution – runif(number, minimum, maximum)
- Normal Distribution – rnorm(number, mean, standard deviation)
- Binomial Distribution – rbinom(number, size, probability)
- Log-normal Distribution – rlnorm(number, mean log, standard deviation log)
- Weibull Distribution – rweibull(number, shape, scale)
- Exponential Distribution – rexp(number, rate)
- Poisson Distribution – rpois(number, lambda)
- Gamma Distribution – rgamma(number, shape, rate)
- Chi-square Distribution – rchisq(number, degrees of freedom, non-centrality parameter)
Each of these functions generates pseudo-random numbers based on the respective distribution. Combined, they add a lot of flexibility to random number generation within R.
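As a quick illustration, the sketch below draws three values from each of the functions listed above after setting one seed. The seed and all parameter values are arbitrary and chosen only to make the calls concrete.

```r
set.seed(10)   # arbitrary seed for this sketch
n <- 3         # number of values to draw from each distribution

samples <- list(
  uniform     = runif(n, min = 0, max = 1),
  normal      = rnorm(n, mean = 0, sd = 1),
  binomial    = rbinom(n, size = 10, prob = 0.5),
  lognormal   = rlnorm(n, meanlog = 0, sdlog = 1),
  weibull     = rweibull(n, shape = 2, scale = 1),
  exponential = rexp(n, rate = 1),
  poisson     = rpois(n, lambda = 4),
  gamma       = rgamma(n, shape = 2, rate = 1),
  chi_square  = rchisq(n, df = 3, ncp = 0)
)

# Show the structure of the result: nine named vectors of length n.
str(samples)
```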
The applications for these random number generators are numerous. They are helpful in understanding and analyzing statistics. For example, being able to reproduce the results of a statistical study with the appropriate random number generator suggests that the results may have been just noise. Another obvious application for such a vast array of random number generators is in game theory.
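One way to see how pure noise can imitate a finding is to run the same test many times on data with no real effect. The sketch below is an illustration under assumed settings (1000 repetitions, two groups of 20 standard normal values, a 0.05 threshold) and is not drawn from any particular study.

```r
set.seed(123)  # arbitrary seed so the simulation can be rerun exactly

# Each repetition compares two samples drawn from the SAME distribution,
# so any "significant" difference is noise by construction.
p_values <- replicate(1000, t.test(rnorm(20), rnorm(20))$p.value)

# Share of repetitions that look significant at the 0.05 level.
false_positive_rate <- mean(p_values < 0.05)
false_positive_rate
```

The share should land near the chosen threshold, which gives a baseline for how often a result of that strength appears when nothing is going on.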
set.seed is an important function in R. It lets you fix the starting point for all of R’s random number generators. Without this ability to instill controlled randomness, the output of a pseudo-random number generator cannot be reproduced. In the real world, this means simulation-based statistical analysis cannot be checked or debugged reliably, and you lose (or win by default) a lot of video games.