Many statistical and business analysis projects require you to select a sample from a list of values. This is particularly true for simulation work. To select a sample, R has the sample() function. This function can be used for combinatorial problems and statistical simulation.

Talk of random samples can make tempers flare in certain audiences. This article focuses on the essentials of using sample() to select values from a list. We also briefly discuss more advanced options for sampling and random number generation.

### R sample() – Random Selections From a List

R has a convenient function for handling sample selection: sample(). This function addresses the common cases:

- Picking from a finite set of values (sampling without replacement)
- Sampling with replacement
- Using all values (reordering) or only a subset (selecting a few values)

By default, sample() shuffles the values you give it and returns all of them in random order. Sample code is below:

sample(vector_of_values)  # general form

sample(c(1:10))  # example: shuffle the numbers 1 through 10

This request returns the following:

[1] 7 8 2 9 1 4 6 3 10 5

As you can see, we’ve shuffled the list of the first 10 numbers into a different order.
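Because the draw is random, your output will almost certainly differ from the one shown here. If you need a repeatable result, one common approach is to fix the random number generator state with set.seed() before calling sample(). The following is a minimal sketch; the seed value 42 and the variable names are arbitrary choices for illustration.

```r
# Fix the random number generator state so the shuffle is repeatable.
# The seed value 42 is an arbitrary choice for illustration.
set.seed(42)
first_shuffle <- sample(1:10)

# Resetting the same seed reproduces the same permutation.
set.seed(42)
second_shuffle <- sample(1:10)

identical(first_shuffle, second_shuffle)  # TRUE

# A shuffle only reorders: sorting it gives back 1 through 10.
identical(sort(first_shuffle), 1:10)      # TRUE
```

Without the set.seed() calls, the two shuffles would usually come out in different orders.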

But what if a value can be selected multiple times? This is known as sampling with replacement. sample() supports this via an additional parameter: *replace*. It can be TRUE or FALSE (T and F are the usual shorthand). The default is FALSE, meaning no replacement. A code example looks like this:

sample(c(1:10), replace = TRUE)

This yields the following result. As you can see, certain values are picked more than once.

[1] 4 7 10 9 4 6 6 4 3 4

We can add the size parameter to return only a few values. The following code will pick three values.

sample(c(1:10), size = 3)

This yields the following result.

[1] 3 6 8

Here is the same call with replacement turned on (the output was hand-picked to show a repeated value):

sample(c(1:10), size = 3, replace = TRUE)

[1] 9 9 1

It took a couple of trials to get that random selection.
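The interaction between size and replace can be checked directly. The sketch below uses illustrative variable names that are not part of the original examples. It shows that a draw without replacement never repeats a value, that a draw with replacement can be larger than the list itself, and that asking for more values than exist without replacement raises an error.

```r
values <- 1:10

# Without replacement, every drawn value is distinct.
no_repl <- sample(values, size = 10)
anyDuplicated(no_repl)  # 0, meaning no duplicates

# With replacement, the sample may be larger than the list.
with_repl <- sample(values, size = 25, replace = TRUE)
length(with_repl)       # 25

# Without replacement, requesting 25 values from a list of 10 fails.
# tryCatch() converts the error into a plain string for inspection.
result <- tryCatch(
  sample(values, size = 25),
  error = function(e) "error"
)
result                  # "error"
```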

As a practical use case, we can use this to figure out who will pick up the bar tab for an R meetup.

sample(c('Joe','Karl','Jack','Larry','Curly','Moe','Kim','Kathy','Sam','Jim'), size = 1)

[1] "Kim"

Drinks are apparently on Kim this week.

### Adjusting Probabilities

The prior examples assume every value on the list is equally likely to be selected. But sample() also allows us to adjust the probability of each item being selected. We do this with the *prob* argument.

Our next example imagines us on a factory floor. We make widgets, which have a certain chance of being defective. Our quality isn’t great, so there is a 25% chance of a widget being defective. We can simulate this using the following code.

sample(c('Good','Bad'), size = 6, replace = TRUE, prob = c(0.75, 0.25))

[1] "Bad" "Good" "Bad" "Good" "Good" "Bad"

As you can see, we stumbled upon a particularly bad sample, with more defects than expected. With an average defect rate of 25%, we would typically expect to find 1 or 2 defects in 6 trials. Instead, we found three: a 50% defect rate. Indeed, our client should hire a quality consultant, ideally a consultant who knows R…
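To put that unlucky draw in context, we can simulate a much larger batch and compare the observed defect rate with the 25% we asked for. The sketch below is illustrative only: the seed, the batch size of 10,000, and the variable names are assumptions, not part of the original example. It also uses dbinom() to compute the chance of seeing exactly three defects in six widgets.

```r
set.seed(1)  # arbitrary seed, used only to make the run repeatable

# Draw a large batch of widgets with the same 75/25 split as above.
n_widgets <- 10000
widgets <- sample(c("Good", "Bad"), size = n_widgets, replace = TRUE,
                  prob = c(0.75, 0.25))

# Observed share of defective widgets; it should land close to 0.25.
observed_rate <- mean(widgets == "Bad")

# Expected number of defects in a batch of 6 widgets.
expected_defects <- 6 * 0.25  # 1.5

# Probability of exactly 3 defects in 6 widgets: roughly 0.13.
p_three_defects <- dbinom(3, size = 6, prob = 0.25)
```

So three defects out of six is worse than average, but not rare: it happens in roughly one batch out of eight.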

### Generating Random Numbers in R

Our examples up to this point have dealt with random selections from finite sets. But what if we need to generate random numbers directly in R, rather than pick them from a list? (Strictly speaking, these are pseudo-random numbers.)

The next part of our tutorial will address generating floating point numbers and values from a specific statistical distribution.