Percentile in R – Efficient Ways To Calculate

We’re going to show you how to calculate a percentile in R. This is particularly useful when you’re doing exploratory analysis and reporting, especially if you’re analyzing data which may not be normally distributed.

We’re going to use the R quantile function; this utility is part of base R (so you don’t need to load any packages) and can be adapted to generate a variety of “rank based” statistics about your sample.

To calculate a percentile in R, pass it to the quantile function as a probability between 0 and 1 (for example, .27 for the 27th percentile). See the example below.

# percentile in r example
> test = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)
> quantile(test, .27)
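
If you’re curious where that number comes from, here is a rough sketch of the rule quantile uses by default (its type 7 method): sort the sample, find the fractional position of the percentile, and interpolate linearly between the two neighbouring values. The helper name manual_percentile is our own invention for illustration, not something base R provides.

```r
# Sketch of R's default (type 7) percentile rule, written out by hand.
# manual_percentile is a hypothetical helper, not part of base R.
manual_percentile <- function(x, p) {
  x  <- sort(x)                        # order the sample
  h  <- (length(x) - 1) * p + 1        # fractional position of the percentile
  lo <- floor(h)                       # nearest position at or below h
  hi <- ceiling(h)                     # nearest position at or above h
  x[lo] + (h - lo) * (x[hi] - x[lo])   # interpolate between the two neighbours
}

test <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)
by_hand  <- manual_percentile(test, 0.27)
built_in <- unname(quantile(test, 0.27))   # drop the "27%" label for comparison

print(by_hand)
print(all.equal(by_hand, built_in))
```

With 17 values, the 27th percentile sits at position 5.32, so the answer lands 32% of the way between the 5th and 6th sorted values (8 and 9).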

Need to calculate a percentile in R despite missing values in your data? By default, quantile stops with an error if it finds any. Set na.rm=TRUE to drop the missing values before the calculation. Sample code shown below:

# calculate percentile in R with missing values
> othertest = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,NA,NA,NA)
> quantile(othertest,.23)
Error in quantile.default(othertest, 0.23) : 
  missing values and NaN's not allowed if 'na.rm' is FALSE
> quantile(othertest,.23, na.rm=TRUE)
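
Before relying on na.rm, it can be worth checking how many observations are being dropped, since the percentile is then based on a smaller sample. The sketch below counts the missing values and confirms that na.rm=TRUE gives the same answer as filtering the NA values out yourself; the variable names are ours, chosen for illustration.

```r
othertest <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,NA,NA,NA)

n_missing <- sum(is.na(othertest))    # values that na.rm = TRUE will drop
n_used    <- sum(!is.na(othertest))   # values the percentile is based on

# Two routes to the same percentile: let quantile drop the NAs,
# or remove them ourselves before the call.
with_na_rm    <- quantile(othertest, 0.23, na.rm = TRUE)
dropped_first <- quantile(othertest[!is.na(othertest)], 0.23)

print(c(missing = n_missing, used = n_used))
print(all.equal(with_na_rm, dropped_first))
```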

Finally, you can generate multiple percentiles in a single function call: pass a vector of percentiles to the “probs” argument rather than a single value. (The shorter “prob” also works, because R partially matches argument names.) This is particularly useful if you need to quickly size up a distribution.

# calculate percentile in R - multiple values
> test = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)
> quantile(test, probs=c(.1,.25,.5,.75,.9))
 10%  25%  50%  75%  90% 
 5.6  8.0  9.0 10.0 11.4 
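
The vector of percentiles doesn’t have to be typed out by hand. As one possible shortcut, the sketch below builds the deciles with seq and then pulls a single result back out by its label; the names decile_probs and deciles are just ours.

```r
test <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)

# Generate 10%, 20%, ..., 90% instead of listing them manually
decile_probs <- seq(0.1, 0.9, by = 0.1)
deciles <- quantile(test, probs = decile_probs)

print(deciles)

# The result is a named vector, so one percentile can be looked up by label
median_value <- deciles[["50%"]]
print(median_value)
```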

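If you need a quick way to check a variable, you can also use the summary function. It reports the minimum, the quartiles (the 25th, 50th, and 75th percentiles), the mean, and the maximum, which covers most of the example above apart from the 10th and 90th percentiles: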

# calculate percentile in R - summary function
> test = c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)
> summary(test)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  3.000   8.000   9.000   8.941  10.000  13.000 
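
As a quick sanity check, the quartiles reported by summary should line up with what quantile returns for the same probabilities. The sketch below compares the two; unclass is used only to turn the summary result into a plain named vector so it can be indexed by label.

```r
test <- c(9,9,8,9,10,9,3,5,6,8,9,10,11,12,13,11,10)

# Strip the summary class so we can index the values by name
s <- unclass(summary(test))
from_summary  <- s[c("1st Qu.", "Median", "3rd Qu.")]
from_quantile <- quantile(test, probs = c(0.25, 0.5, 0.75))

print(from_summary)
print(from_quantile)

# Compare the numbers only, since the two results carry different labels
print(all.equal(as.numeric(from_summary), as.numeric(from_quantile)))
```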
