Fixing the R error: contrasts can be applied only to factors with 2 or more levels

When fitting a linear model, this error message shows up when the data frame you are analyzing contains a factor column that has only one level. In other words, the column is a factor and every row in it has the same value. The easiest way to eliminate this R error is to exclude that column from the model fitting.

Description of the R error – contrasts can be applied only to factors with 2 or more levels

Factors are R's data type for categorical variables. When a factor is used as a predictor (independent variable) in a model, R encodes it as one or more dummy variables. These indicator variables take a value of zero or one to show which level each row belongs to. The encoding is defined by a contrast, such as the default treatment contrasts or orthogonal contrasts, and a contrast compares the levels of the factor against each other. This requires the factor to have more than one level; if it has only one level, there is nothing to compare, and it will trigger our error message. When a factor has two or more levels, the model can estimate the mean of the response for each factor level. This can also be used to calculate important statistical data such as marginal means and effect sizes.
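To make the role of contrasts concrete, here is a minimal sketch that prints the dummy coding R assigns to a factor. The factor names and values are made up for illustration, and the output assumes R's default treatment contrasts (the default setting of options("contrasts")).

```r
# Hypothetical factors used only for illustration.
two_level <- factor(c("no", "yes", "no", "yes"))
three_level <- factor(c("low", "mid", "high"),
                      levels = c("low", "mid", "high"))

# A factor with k levels is coded with k - 1 dummy columns.
# The first level is the baseline that the others are compared against.
contrasts(two_level)    # one column:  yes
contrasts(three_level)  # two columns: mid, high
```

A factor with a single level would leave zero dummy columns, which is why the model cannot use it.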

Explanation of the R error

Here are two examples. The first triggers the error message, and the second shows the same model running once the factor has two levels.

> df = data.frame(A=c(1, 4, 3, 4, 5),
+ B=as.factor(3),
+ C=c(4, 2, 8, 3, 2),
+ D=c(1, 8, 2, 8, 9))
> df
A B C D
1 1 3 4 1
2 4 3 2 8
3 3 3 8 2
4 4 3 3 8
5 5 3 2 9
> lm(D ~ A + B + C, data=df)
Error in `contrasts`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels

In this example, column B is a factor with only one level because all of the rows have the same value. As a result, it triggers our message.
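If you are not sure whether a column is the culprit, you can inspect it directly. This is a small sketch using the same data frame as above; it only relies on the base R functions is.factor, nlevels, levels and table.

```r
df <- data.frame(A = c(1, 4, 3, 4, 5),
                 B = as.factor(3),
                 C = c(4, 2, 8, 3, 2),
                 D = c(1, 8, 2, 8, 9))

is.factor(df$B)  # TRUE: B is stored as a factor
nlevels(df$B)    # 1: the factor has a single level
levels(df$B)     # "3": the only level
table(df$B)      # counts per level: all 5 rows fall under "3"
```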

> df = data.frame(A=c(1, 4, 3, 4, 5),
+ B=as.factor(c(3, 3, 5, 3, 3)),
+ C=c(4, 2, 8, 3, 2),
+ D=c(1, 8, 2, 8, 9))
> df
A B C D
1 1 3 4 1
2 4 3 2 8
3 3 5 8 2
4 4 3 3 8
5 5 3 2 9
> lm(D ~ A + B + C, data=df)

Call:
lm(formula = D ~ A + B + C, data = df)

Coefficients:
(Intercept) A B5 C
-8.889e-01 2.111e+00 -3.444e+00 3.140e-16

In this example, we change one of the values in column B, raising the number of factor levels to two. Not only does it not trigger the message, but the model also produces a coefficient (B5) for column B. This shows that the cause of the problem is column B having only one value while being a factor column.

How to fix the R error

Here are two options for fixing this problem.

> df = data.frame(A=c(1, 4, 3, 4, 5),
+ B=as.factor(3),
+ C=c(4, 2, 8, 3, 2),
+ D=c(1, 8, 2, 8, 9))
> lm(D ~ A + C, data=df)

Call:
lm(formula = D ~ A + C, data = df)

Coefficients:
(Intercept) A C
2.1040 1.7790 -0.6717

This example simply excludes column B as a variable in the model. A column in which every row has the same value carries no information about the response, so leaving it out loses nothing. This simple step fixes the problem.
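With many columns, finding and excluding the offending factors by hand gets tedious. The following is one possible sketch of automating it, not the only way to do so. The names is_constant_factor, predictors and model_formula are made up for this example, and the response column is assumed to be D as in the examples above.

```r
df <- data.frame(A = c(1, 4, 3, 4, 5),
                 B = as.factor(3),
                 C = c(4, 2, 8, 3, 2),
                 D = c(1, 8, 2, 8, 9))

# Flag factor columns that have fewer than two levels in use.
# droplevels() discards levels that no row actually uses.
is_constant_factor <- vapply(
  df,
  function(col) is.factor(col) && nlevels(droplevels(col)) < 2,
  logical(1)
)

# Keep every remaining column except the response as a predictor.
predictors <- setdiff(names(df)[!is_constant_factor], "D")

# Build the formula D ~ A + C and fit the model.
model_formula <- reformulate(predictors, response = "D")
fit <- lm(model_formula, data = df)
coef(fit)
```

Printing names(df)[is_constant_factor] before fitting shows which columns were left out, which is worth checking rather than dropping them silently.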

> df = data.frame(A=c(1, 4, 3, 4, 5),
+ B=c(3, 3, 3, 3, 3),
+ C=c(4, 2, 8, 3, 2),
+ D=c(1, 8, 2, 8, 9))
> lm(D ~ A + B + C, data=df)

Call:
lm(formula = D ~ A + B + C, data = df)

Coefficients:
(Intercept) A B C
2.1040 1.7790 NA -0.6717

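This example avoids the error by changing column B into an ordinary numeric vector. The model now runs, but column B is given NA as its coefficient, because a constant column is perfectly collinear with the intercept and its effect cannot be estimated. The remaining coefficients are the same as in the first fix.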
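The NA coefficient can be checked by looking at the design matrix that lm builds. This sketch uses the same data as above; X and fit are names chosen for this example.

```r
df <- data.frame(A = c(1, 4, 3, 4, 5),
                 B = c(3, 3, 3, 3, 3),
                 C = c(4, 2, 8, 3, 2),
                 D = c(1, 8, 2, 8, 9))

# The design matrix has four columns: (Intercept), A, B, C.
X <- model.matrix(D ~ A + B + C, data = df)
ncol(X)

# Its rank is lower than its number of columns, because column B
# is just 3 times the intercept column of ones.
qr(X)$rank
all(X[, "B"] == 3 * X[, "(Intercept)"])

# lm cannot estimate the redundant term and reports NA for it.
fit <- lm(D ~ A + B + C, data = df)
is.na(coef(fit)["B"])
```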

This is an easy problem to understand. You are most likely to get this error message when you are dealing with data that you do not control. It is also an easy problem to fix, because the column that is causing it can simply be removed from the model.