The “NaNs produced” message in R is a warning rather than an error. It is usually not the result of a coding mistake in your R program. The problem comes from the content of the data. Specifically, functions such as sqrt() and log() return real results for ordinary numeric input, so when the true answer would be an imaginary number they return NaN instead. R does have a complex number type, but these functions only use it when you pass them complex input. As a result, unless you deliberately work with complex values, you will see this warning whenever a built-in function would otherwise produce an imaginary number.
The circumstances of this error.
This problem occurs when your R code applies a log or even-root function to data that contains negative numbers, as can easily happen with normally distributed values in a data frame. The real-valued result is undefined for such input, so these functions return NaN and issue the warning.
> sqrt(-1)
[1] NaN
Warning message:
In sqrt(-1) : NaNs produced
This example illustrates the circumstance perfectly, because the square root of -1, commonly denoted i, is the imaginary unit. However, you need to understand that the problem is usually not with your code but with the data frame or model that you are using.
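As a minimal sketch of the difference between numeric and complex input, the lines below convert the value with as.complex() before taking the root or the log. The variable names z and w are only illustrative.

```r
# Numeric input gives NaN, but complex input gives a complex result.
z <- sqrt(as.complex(-1))
print(z)        # the imaginary unit, printed as 0+1i

w <- log(as.complex(-1))
print(Re(w))    # real part is 0
print(Im(w))    # imaginary part equals pi
```

Whether converting to complex is appropriate depends on what the negative values in your data actually mean; often they signal a data problem rather than a need for complex arithmetic.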
What is causing this error?
This problem results from negative numbers in the data being passed to a log or even-root function.
> a = c(10,-2,5,NA,6,-15,20,Inf,22,-Inf,8,0)
> log(a)
[1] 2.302585 NaN 1.609438 NA 1.791759 NaN 2.995732 Inf 3.091042 NaN 2.079442 -Inf
Warning message:
In log(a) : NaNs produced
> sqrt(a)
[1] 3.162278 NaN 2.236068 NA 2.449490 NaN 4.472136 Inf 4.690416 NaN 2.828427 0.000000
Warning message:
In sqrt(a) : NaNs produced
As you can see from the above examples, this problem is not triggered by NA values or by positive infinity: NA stays NA and Inf stays Inf. Those values can occur in a vector or data frame, but the warning requires a numeric value that is negative, including -Inf. When a negative number is put through a log or even-root R function, the result is NaN and a warning message is triggered. This happens because these functions, given ordinary numeric input, do not return imaginary numbers.
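To see which elements are responsible before calling log() or sqrt(), one possible check is to flag the negative entries directly. This is a small sketch using the same vector; the name negative is just an illustrative choice.

```r
a <- c(10, -2, 5, NA, 6, -15, 20, Inf, 22, -Inf, 8, 0)

# TRUE only where the value is present and below zero (NA is excluded)
negative <- !is.na(a) & a < 0

which(negative)   # positions of the offending values: 2, 6 and 10
a[negative]       # the values themselves: -2, -15 and -Inf
```

These are the same positions at which log(a) and sqrt(a) returned NaN in the output above.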
How to fix this error.
One option is to check the result for NaN values using the is.nan() function, which produces a logical vector that can easily be checked.
> a = c(10,-2,5,NA,6,-15,20,Inf,22,-Inf,8,0)
> is.nan(log(a))
[1] FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
Warning message:
In log(a) : NaNs produced
Note from this example that this does not handle an infinite or missing value. Furthermore, it does not really prevent the warning message. It just creates another set of data points to deal with.
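If the goal is simply to avoid the warning, one approach is to apply the function only to the values it can handle and leave the rest as NA. The sketch below assumes that treating negative inputs as missing is acceptable for your analysis; the names ok and result are illustrative.

```r
a <- c(10, -2, 5, NA, 6, -15, 20, Inf, 22, -Inf, 8, 0)

# Values that log() can handle without producing NaN
ok <- !is.na(a) & a >= 0

# Start with all NA, then fill in only the valid positions
result <- rep(NA_real_, length(a))
result[ok] <- log(a[ok])

result   # no NaN values, and no warning is raised
```

Note that this makes negative inputs indistinguishable from values that were missing to begin with, so keep the original vector if you need to tell them apart.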
> a = c(10,-2,5,6,-15,20,22,8,0)
> b = a
> for(i in 1:length(a)){
+ if(a[i] > 0) b[i]=sqrt(a[i])
+ if(a[i] < 0) {
+ c = sqrt(-a[i])
+ e = toString(c)
+ e = paste(substring(e, c(1,nchar(e)+1)), collapse="i")
+ b[i] = e
+ }
+ }
> b
[1] "3.16227766016838" "1.4142135623731i" "2.23606797749979" "2.44948974278318" "3.87298334620742i"
[6] "4.47213595499958" "4.69041575982343" "2.82842712474619" "0"
Here is a real solution to the problem. In this case, we not only get the square root of each number, but we have also added an “i” to the end of the imaginary ones. The one downside is that these are no longer numeric values: assigning a string to one element converts the whole vector to character. An ordinary numeric vector cannot hold imaginary numbers, so getting the math correct makes the code more complicated than it otherwise would be. For a situation where you need to maintain numeric values, you could use a two-column matrix where the second column indicates whether the number is real or imaginary.
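The two-column matrix idea could look something like the following sketch. The column names magnitude and imaginary are assumptions made for illustration, with 1 marking an imaginary root and 0 a real one.

```r
a <- c(10, -2, 5, 6, -15, 20, 22, 8, 0)

# Column 1: square root of the absolute value
# Column 2: 1 if the original value was negative (imaginary root), else 0
roots <- cbind(magnitude = sqrt(abs(a)),
               imaginary = as.numeric(a < 0))

roots
is.numeric(roots)   # TRUE, so the values can still be used in arithmetic
```

R's built-in complex type, reached through sqrt(as.complex(a)), is another way to keep the results numeric without the extra column.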
This problem is a result of the fact that the built-in functions do not return imaginary numbers when given ordinary numeric input. That is understandable, because the numeric values used in most computing are real numbers, and this makes it harder to write software that handles imaginary numbers transparently. The programmer therefore has to work around the limitation, either by cleaning the data before applying the function or by writing code that handles negative values explicitly.