We’re going to show you how to calculate a percentile in R. This is particularly useful when you’re doing exploratory analysis and reporting, especially if you’re analyzing data that may not be normally distributed.
We’re going to use R’s quantile function; this utility is part of base R (so you don’t need to load any packages) and can be adapted to generate a variety of “rank based” statistics about your sample.
To calculate a percentile in R, pass the percentile to the quantile function as a fraction between 0 and 1; for example, .27 requests the 27th percentile. See the example below.
# percentile in r example
> test = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)
> quantile(test, .27)
 27%
8.32
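If you’re wondering where 8.32 comes from: by default, quantile uses what the R documentation calls type 7, which linearly interpolates between the two sorted values on either side of the requested position. The sketch below reproduces that calculation by hand. Note that manual_percentile is a hypothetical helper written for illustration, not part of base R, and it assumes a numeric vector with no missing values and a single p between 0 and 1.

```r
# Hand-rolled version of the default (type 7) estimate, for illustration only.
# manual_percentile is a hypothetical name, not a base R function.
manual_percentile <- function(x, p) {
  x <- sort(x)                 # order the sample
  n <- length(x)
  h <- (n - 1) * p + 1         # fractional position in the sorted sample
  lo <- floor(h)               # index of the value just below that position
  hi <- min(lo + 1, n)         # index just above, capped at n for p = 1
  # linear interpolation between the two neighbouring values
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

test <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)

by_hand  <- manual_percentile(test, .27)
built_in <- unname(quantile(test, .27))

# TRUE if the two approaches agree up to floating point tolerance
isTRUE(all.equal(by_hand, built_in))
```

quantile also accepts a type argument that selects other interpolation rules, so other software with a different default may give slightly different percentiles for the same data.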
Need to calculate a percentile in R despite missing values in your data? You can use the na.rm option to remove missing values before the calculation. Sample code shown below:
# calculate percentile in R with missing values
> othertest = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,NA,NA,NA)
> quantile(othertest,.23)
Error in quantile.default(othertest, 0.23) :
  missing values and NaN's not allowed if 'na.rm' is FALSE
> quantile(othertest,.23, na.rm=TRUE)
 23%
7.98
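Before dropping missing values, it can be worth checking how many there are, since the percentile is then based on a smaller sample. The sketch below counts the NAs and shows that na.rm=TRUE should give the same answer as filtering them out yourself; the variable names n_missing, complete, p_filtered and p_na_rm are ours, chosen for illustration.

```r
othertest <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,NA,NA,NA)

# how many observations are missing?
n_missing <- sum(is.na(othertest))

# drop the missing values by hand, then compute the percentile
complete   <- othertest[!is.na(othertest)]
p_filtered <- quantile(complete, .23)

# let quantile drop them for us
p_na_rm <- quantile(othertest, .23, na.rm = TRUE)

n_missing
length(complete)

# TRUE if both routes lead to the same percentile
isTRUE(all.equal(p_filtered, p_na_rm))
```

Here three of the seventeen values are missing, so the percentile is computed from the remaining fourteen.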
Finally, you can generate multiple percentiles in a single function call: pass the percentiles as a vector to the “probs” argument rather than using a single percentile value (the shorter “prob” also works, because R matches argument names partially). This is particularly useful if you need to quickly size up a distribution.
# calculate percentile in R - multiple values
> test = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)
> quantile(test, probs=c(.1,.25,.5,.75,.9))
 10%  25%  50%  75%  90%
 5.6  8.0  9.0 10.0 11.4
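If you want evenly spaced percentiles, such as deciles, you don’t have to type the vector by hand. A minimal sketch, assuming the same test vector as above; decile_probs and deciles are illustrative names, while seq and the names argument of quantile are part of base R.

```r
test <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)

# 0.1, 0.2, ..., 0.9
decile_probs <- seq(0.1, 0.9, by = 0.1)

# one call returns all nine deciles, labelled "10%", "20%", ...
deciles <- quantile(test, probs = decile_probs)
deciles

# names = FALSE returns a plain numeric vector without the labels
deciles_plain <- quantile(test, probs = decile_probs, names = FALSE)
deciles_plain
```

The unlabelled version is handy when you feed the result into further calculations and the percentage labels would only get in the way.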
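If you need a quick way to check a variable, you can also use the summary function. In a single call it reports the minimum, first quartile (25th percentile), median (50th percentile), mean, third quartile (75th percentile), and maximum, which covers most of the example above.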
# calculate percentile in R - summary function
> test = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)
> summary(test)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  3.000   8.000   9.000   8.941  10.000  13.000
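To see how summary relates to quantile, you can compare the two directly. The sketch below pulls the five rank-based entries out of the summary and checks them against quantile at 0, .25, .5, .75 and 1; the names from_summary and from_quantile are ours, and we use unclass to treat the summary as a plain named vector.

```r
test <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)

# treat the summary as a plain named numeric vector
s <- unclass(summary(test))

# the rank-based entries (everything except the mean)
from_summary <- unname(s[c("Min.", "1st Qu.", "Median", "3rd Qu.", "Max.")])

# the same five points requested from quantile
from_quantile <- quantile(test, probs = c(0, .25, .5, .75, 1), names = FALSE)

from_summary
from_quantile

# TRUE if summary and quantile agree on these five values
isTRUE(all.equal(from_summary, from_quantile))
```

For anything beyond the quartiles, such as the 10th or 90th percentile, you still need to call quantile yourself.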