Fixing R error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'y'

You can get this error message when you fit a linear model with lm() to a data frame that contains an “Inf” value. The fix is to convert each offending “Inf” value to “NA”, so that lm() drops the affected rows instead of passing them on to lm.fit().

Description of the R Error

The linear model function lm() is not part of the ggplot package, nor is it related to the boxcox formula. It belongs to R's built-in stats package. It does, however, call the lm.fit() function, which is evident because lm.fit(x, y, ...) shows up in the message. lm.fit() raises this error if any value passed to it is not finite, and the message ends in 'x' or 'y' depending on whether the offending value is in a predictor or in the response. Despite what the message says, you do not get this error from a data frame that only contains “NA” and “NaN” values, because lm() with the default na.action (na.omit) drops those rows before calling lm.fit(). An “Inf” value is not treated as missing, so it gets through. Once you convert any “Inf” values to “NA” values, lm() will work just fine by ignoring them.
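The difference between the three special values can be checked directly. The following is a minimal sketch, assuming the default na.omit behaviour (passed explicitly here so the result does not depend on session options). The variable names are illustrative only.

```r
# How R classifies the special values
x <- c(1, NA, NaN, Inf, -Inf)
na_flags     <- is.na(x)      # TRUE for NA and NaN, FALSE for Inf and -Inf
finite_flags <- is.finite(x)  # TRUE only for ordinary numbers

# Build the model frame the way lm() would with na.omit:
# rows with NA or NaN are dropped, the row with Inf is kept.
df <- data.frame(y = 1:10,
                 x = c(1, 2, NA, 4, 5, Inf, 7, 8, NaN, 10))
kept <- model.frame(y ~ x, data = df, na.action = na.omit)

nrow(kept)                  # 8 of the 10 rows remain
any(is.infinite(kept$x))    # TRUE: the Inf row is still there
```

Because the “Inf” row is still in the model frame, it is the value that reaches lm.fit() and triggers the error.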

Explanation of the R Error

This section has two code examples. The first produces the message, and the second does not. Comparing the two shows what really causes this message.

> df = data.frame(y = c(1, 2, 3, 4 , 5, 6, 7, 8, 9, 10),
+ x = c(1, 2, NA, 4, 5, Inf, 7, 8, NaN, 10))
> df
    y   x
1   1   1
2   2   2
3   3  NA
4   4   4
5   5   5
6   6 Inf
7   7   7
8   8   8
9   9 NaN
10 10  10
> lm(y ~ x, df)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in 'x'

The data in this example contains “NA”, “Inf” and “NaN” values. As a result, it produces the message.

> df = data.frame(y = c(1, 2, 3, 4 , 5, 6, 7, 8, 9, 10),
+ x = c(1, 2, NA, 4, 5, 6, 7, 8, NaN, 10))
> df
    y   x
1   1   1
2   2   2
3   3  NA
4   4   4
5   5   5
6   6   6
7   7   7
8   8   8
9   9 NaN
10 10  10
> lm(y ~ x, df)

Call:
lm(formula = y ~ x, data = df)

Coefficients:
(Intercept)            x
 -2.512e-15    1.000e+00

This example data contains only “NA” and “NaN” values and does not produce the message, which shows that the “Inf” value is the one causing the problem.
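Before fitting, it can help to find out where the infinite values are. The following is a small sketch using the same example data. The names inf_counts and inf_rows are illustrative and not part of any package.

```r
df <- data.frame(y = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 x = c(1, 2, NA, 4, 5, Inf, 7, 8, NaN, 10))

# Count infinite values (Inf or -Inf) in each column
inf_counts <- vapply(df, function(col) sum(is.infinite(col)), integer(1))

# Row positions of the infinite values in column x
inf_rows <- which(is.infinite(df$x))

inf_counts   # y has 0, x has 1
inf_rows     # 6
```

is.infinite() is applied column by column here because it works on vectors rather than on a whole data frame.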

How to fix the R Error

In this section, the example code shows how to fix the problem by changing each “Inf” and “NaN” value into an “NA” value.

> df = data.frame(y = c(1, 2, 3, 4 , 5, 6, 7, 8, 9, 10),
+ x = c(1, 2, NA, 4, 5, Inf, 7, 8, NaN, 10))
> df
    y   x
1   1   1
2   2   2
3   3  NA
4   4   4
5   5   5
6   6 Inf
7   7   7
8   8   8
9   9 NaN
10 10  10
> df2 = df
> df2[is.na(df2) | df2 == Inf] = NA
> df2
    y  x
1   1  1
2   2  2
3   3 NA
4   4  4
5   5  5
6   6 NA
7   7  7
8   8  8
9   9 NA
10 10 10
> lm(y ~ x, df2)

Call:
lm(formula = y ~ x, data = df2)

Coefficients:
(Intercept)            x
 -1.343e-15    1.000e+00

The data in this example contains “NA”, “Inf” and “NaN” values. Changing the “Inf” value into “NA” fixes the problem. The code also changes “NaN” into “NA”, although lm() would have dropped that row anyway. This is a simple fix that you can apply without inspecting the data by hand. Bear in mind that every converted row is left out of the fit, so the model is estimated on fewer observations.
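The same fix can be wrapped in a reusable function. The following is a sketch only: replace_nonfinite is a hypothetical helper name, not a function from R or from any package. Unlike a comparison against Inf alone, it relies on is.finite(), so it also catches -Inf, and it leaves non-numeric columns untouched.

```r
# Hypothetical helper: set every non-finite value (NA, NaN, Inf, -Inf)
# in the numeric columns of a data frame to NA.
replace_nonfinite <- function(data) {
  numeric_cols <- vapply(data, is.numeric, logical(1))
  data[numeric_cols] <- lapply(data[numeric_cols], function(col) {
    col[!is.finite(col)] <- NA
    col
  })
  data
}

df <- data.frame(y = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 x = c(1, 2, NA, 4, 5, Inf, 7, 8, NaN, 10))

clean <- replace_nonfinite(df)
fit <- lm(y ~ x, data = clean)   # fits on the 7 complete rows
coef(fit)
```

Working column by column is a deliberate choice here, because is.finite() accepts vectors but not a whole data frame.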

This is a rather tricky error message because it does not tell you which of the three kinds of value is actually causing the problem. At first glance it looks more complicated than it is. Do not let it intimidate you. While the solution is not intuitive, it is simple, and once you know to convert “Inf” values to “NA” the error is easy to fix.