How to Fix the R Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric

When you run a Principal Components Analysis with prcomp, you will get the “Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric” message if any column of your data frame holds characters or other non-numeric values. Fortunately, the fix is simple: convert the offending column into a factor and then into a numeric variable.

Description of the error

This error occurs because the prcomp function works only on numeric data, so every column of the data frame you pass to it must be numeric. If any column holds characters, factors, or other non-numeric values (for example, missing values recorded as text), the call fails with this message. A numeric column that contains NA is still numeric and does not trigger this particular message. To run the analysis, make sure every column you pass in is numeric.
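Before calling prcomp, it can help to check which columns are numeric. The following is a minimal sketch, assuming a small made-up data frame shaped like the one used later in this article; the values and the names is_num and bad_cols are illustrative only.

```r
# Hypothetical example data, shaped like the data frame used in this article
df <- data.frame(
  z = c("A", "B", "C", "D", "E"),
  x = c(0.02, 0.63, -0.91, 0.90, 0.76),
  y = c(0.41, 0.78, 1.83, -0.10, -0.77)
)

# TRUE for columns prcomp can use directly, FALSE for columns that trigger the error
is_num <- sapply(df, is.numeric)
print(is_num)

# Names of the columns that need attention before calling prcomp
bad_cols <- names(df)[!is_num]
print(bad_cols)
```

Any column name that shows up in bad_cols has to be converted or left out before the data frame goes to prcomp.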

Explanation of the error

The following example reproduces the error. Note column z of the data frame.

> t = as.numeric(Sys.time())
> set.seed(t)
> z = c("A", "B", "C", "D", "E")
> x = rnorm(5)
> y = rnorm(5)
> df = data.frame(z, x, y)
> df
  z           x           y
1 A  0.02307778  0.41365815
2 B  0.63213959  0.77502100
3 C -0.91366753  1.83374930
4 D  0.90422176 -0.09915274
5 E  0.75987927 -0.77146351
> pr = prcomp(df)
Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric

If you look at data frame df, you will notice that column z holds characters instead of numbers. This triggers the error because colMeans, which prcomp calls while centering the data, accepts only numeric values.
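The reason one text column is enough to break the whole call can be sketched as follows. prcomp converts the data frame to a matrix, and a matrix holds a single type, so one character column turns every cell into text. The data values and the name msg are illustrative assumptions; tryCatch is used only so the script keeps running after the error.

```r
# Hypothetical example data with one character column
df <- data.frame(
  z = c("A", "B", "C", "D", "E"),
  x = c(0.02, 0.63, -0.91, 0.90, 0.76),
  y = c(0.41, 0.78, 1.83, -0.10, -0.77)
)

# A matrix can hold only one type, so the numbers are stored as text too
m <- as.matrix(df)
print(typeof(m))   # "character"

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    prcomp(df)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```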

How to fix the error

Here is an example of how to fix the problem. Converting the character column into a factor gives each distinct value an integer code, and the factor can then be converted into a numeric column. This is what we do in this example, and it fixes the problem.

> t = as.numeric(Sys.time())
> set.seed(t)
> z = c("A", "B", "C", "D", "E")
> x = rnorm(5)
> y = rnorm(5)
> df = data.frame(z, x, y)
> df
  z          x          y
1 A  1.0158299  1.3621230
2 B -1.0393691 -0.4218296
3 C  0.1113177  0.5536360
4 D  1.8122020  1.1435097
5 E -1.3957393  0.9001602
> df2 = df
> df2$z = as.numeric(as.factor(df2$z))
> df2
  z          x          y
1 1  1.0158299  1.3621230
2 2 -1.0393691 -0.4218296
3 3  0.1113177  0.5536360
4 4  1.8122020  1.1435097
5 5 -1.3957393  0.9001602
> pr = prcomp(df2)
> pr
Standard deviations (1, .., p=3):
[1] 1.664002 1.351141 0.469779

Rotation (n x k) = (3 x 3):
          PC1        PC2        PC3
z  0.86675116 -0.4768597 -0.1461071
x -0.49435044 -0.7826437 -0.3782676
y -0.06603079 -0.4000920  0.9140932

If you compare data frames df and df2, you will see that in df2 column z is a series of numbers rather than letters. The conversion works by turning the column into a factor and then turning the factor into a vector of numeric codes, one per level in sorted order, so A becomes 1, B becomes 2, and so on.
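To see how the letters map to numbers, and to apply the same conversion to every non-numeric column at once, something like the sketch below may help. The helper recode_non_numeric is hypothetical (it is not part of base R or any package), and the data values are made up for illustration.

```r
z <- c("A", "B", "C", "D", "E")
f <- as.factor(z)
print(levels(f))       # the distinct values, in sorted order
print(as.numeric(f))   # the position of each value among those levels

# Hypothetical helper: recode every non-numeric column as factor codes
recode_non_numeric <- function(d) {
  for (col in names(d)) {
    if (!is.numeric(d[[col]])) {
      d[[col]] <- as.numeric(as.factor(d[[col]]))
    }
  }
  d
}

# Made-up data with one character column
df <- data.frame(
  z = z,
  x = c(1.02, -1.04, 0.11, 1.81, -1.40),
  y = c(1.36, -0.42, 0.55, 1.14, 0.90)
)

df2 <- recode_non_numeric(df)
pr <- prcomp(df2)   # runs without the error
print(pr$sdev)
```

The codes simply follow the sorted order of the levels, so for categories with no natural order the numeric values carry no meaning beyond telling the categories apart.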

This error results from a mistake that is easy to make but also easy to fix. Just make sure that everything you put through a Principal Components Analysis is numeric. With that one correction, the analysis runs without errors and you get the results you are looking for.
