How to Calculate Standard Error in R

You can easily calculate the standard error of the mean using functions that ship with base R. There is no built-in standard error function, so divide the result of the sd function (standard deviation in R) by the square root of the sample size.

# Calculate Standard Error in R
> product_tests <- c(15,13,12,35,12,12,11,13,12,13,15,11,13,12,15)

# Calculate Standard Error in R
# using the sd function / sqrt of vector length

> sd(product_tests)/sqrt(length(product_tests))
[1] 1.519607

One annoying quirk of real-life data sets is that they often have missing values. The sd function returns NA if any value is missing, and length counts NA entries as observations, so both parts of the formula need adjusting. Use the na.rm option in sd and wrap the vector in na.omit before taking its length, as shown below, so that the standard error is calculated using only the observed values of the series.

# Calculate Standard Error in R

> product_tests <- c(15,13,12,35,12,12,11,13,12,13,15,11,13,12,15, NA, NA, NA)

> product_tests
 [1] 15 13 12 35 12 12 11 13 12 13 15 11 13 12 15 NA NA NA

> sd(product_tests, na.rm=TRUE)/sqrt(length(na.omit(product_tests)))
[1] 1.519607
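
If you run this calculation often, the two steps above can be wrapped in a small helper. The sketch below is one way to do it; std_error is a hypothetical name chosen for illustration, not a function that base R provides.

```r
# Hypothetical helper (not part of base R): standard error of the mean.
std_error <- function(x, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]  # drop missing values before counting observations
  }
  sd(x) / sqrt(length(x))
}

product_tests <- c(15,13,12,35,12,12,11,13,12,13,15,11,13,12,15, NA, NA, NA)

# Same data as above, so this should agree with the value printed earlier.
se <- std_error(product_tests, na.rm = TRUE)
print(se)
```

With na.rm left at its default of FALSE, the helper returns NA for this vector, which mirrors how sd itself treats missing values.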

Uses of the Standard Error in R

The standard error of a statistic is the estimated standard deviation of its sampling distribution. Picture drawing many samples of the same size from the population and calculating the mean (or other statistic) of each one: the standard error describes how much those sample means vary from one sample to the next. In practice you usually have only one sample, so you estimate it from the sample standard deviation. This statistic is commonly included in summary statistics and descriptive statistics views, and it is often drawn as error bars on a barplot. The formula assumes independent observations, so it is important in a test or experiment that you use a random sampling method. With random sampling and a reasonably large sample, the sampling distribution of the mean is approximately normal.
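
The idea of a sampling distribution can be checked with a quick simulation. The sketch below assumes a made-up normal population (mean 50, standard deviation 10) and an arbitrary seed and sample size; none of these values come from the data used earlier.

```r
set.seed(42)  # arbitrary seed so the run is repeatable

# Assumed population for illustration: normal with mean 50 and sd 10.
n <- 25
sample_means <- replicate(5000, mean(rnorm(n, mean = 50, sd = 10)))

# Empirical standard deviation of the 5000 sample means
empirical_se <- sd(sample_means)

# Theoretical standard error: sigma / sqrt(n) = 10 / 5 = 2
theoretical_se <- 10 / sqrt(n)

print(c(empirical = empirical_se, theoretical = theoretical_se))
```

The two numbers should land close to each other, which is the sense in which the standard error is the standard deviation of the sample means.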

The standard error tells you how precisely the mean of a given sample estimates the true population mean. If you’ve got a large standard error, your statistic is likely to fall further from the true value. As sample sizes increase, sample means cluster more closely around the true mean.

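As the formula above indicates, the standard error depends on the sample size: it shrinks in proportion to the square root of the sample size, so you need four times as many observations to cut it in half. This also has implications for the confidence interval for your estimate of the population mean, which narrows as the standard error falls.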
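
Here is a rough sketch of both points, reusing the product_tests data from earlier. The larger sample sizes (60 and 240) are hypothetical and hold the standard deviation fixed purely to show the scaling. The interval uses the t distribution, which assumes roughly normal data; with an outlier like 35 in the series, treat it as approximate.

```r
product_tests <- c(15,13,12,35,12,12,11,13,12,13,15,11,13,12,15)

n  <- length(product_tests)
s  <- sd(product_tests)
se <- s / sqrt(n)

# Holding the standard deviation fixed, quadrupling n halves the standard error.
se_by_n <- sapply(c(15, 60, 240), function(k) s / sqrt(k))
print(se_by_n)

# Approximate 95% confidence interval for the mean, using the t distribution
# with n - 1 degrees of freedom.
t_crit <- qt(0.975, df = n - 1)
ci <- mean(product_tests) + c(-1, 1) * t_crit * se
print(ci)
```

The interval computed this way should match the one reported by t.test(product_tests)$conf.int.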

Applications to Regression Analysis

Note that for a linear regression model, the term standard error can refer to two different things: the residual standard error, which is the square root of the reduced chi-squared statistic (the residual sum of squares divided by the residual degrees of freedom), or the standard error of a specific regression coefficient. The residual standard error tells you how far observed values typically fall from the predicted values on the regression line. The coefficient standard errors tell you how precisely each coefficient is estimated, and they feed into the t-tests and confidence intervals reported for the model.
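
Both quantities can be pulled out of a fitted model. The sketch below uses the mtcars data set that ships with R; the choice of mpg and wt is only for illustration and has nothing to do with the product test data above.

```r
# mtcars ships with R; the model is chosen only for illustration.
fit <- lm(mpg ~ wt, data = mtcars)
fit_summary <- summary(fit)

# Residual standard error as reported by summary()
residual_se <- fit_summary$sigma

# The same value by hand: sqrt(residual sum of squares / residual df)
manual_rse <- sqrt(sum(residuals(fit)^2) / df.residual(fit))

# Standard error of each estimated coefficient (intercept and slope)
coef_se <- coef(fit_summary)[, "Std. Error"]

print(residual_se)
print(coef_se)
```

The coefficient standard errors are also the square roots of the diagonal of vcov(fit), which is handy if you need them without building the full summary.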

Related Materials