How to fix "Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)"

If you are getting this error message, it is because you passed data that contains missing or non-numeric values to the kmeans function. This happens when your data frame has a missing value (NA) or a column of characters. In both cases, fixing the problem requires removing the offending rows or columns before clustering.

Description of the error

Like knn, the k-means algorithm is a machine learning algorithm. Unlike knn, however, it is an unsupervised learning algorithm used for clustering problems. The data frame you pass to kmeans can only contain numeric values; otherwise, you will get this error message. When the call succeeds, kmeans prints a lengthy output consisting of cluster means, a clustering vector, and a lot more. The error occurs if the data frame has NA values or a column consisting of characters. When the data contains characters, you will also get a warning message, typically saying that NAs were introduced by coercion.

Explanation of the error

Here are two code examples that show how this problem occurs.

> df = data.frame(A = c(1,2,6,3,4,5),
+ B = c(2,4,NA,6,8,10),
+ C = c(3,6,7,9,12,15))
> kmeans(df, centers = 2)
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)

In this first example, column B contains a single NA value. This is what causes the problem.

> df = data.frame(L = c("A","B","C","D","E"),
+ A = c(1,2,3,4,5),
+ B = c(2,4,6,8,10),
+ C = c(3,6,9,12,15))
> kmeans(df, centers = 2)
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)

In this example, the data frame has a column of letters, and this is what causes the problem.
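Before calling kmeans, it can help to check which columns are responsible. The following is a minimal sketch, assuming a hypothetical data frame that combines both problems shown above; the variable names (numeric_cols, na_counts, inf_counts) are made up for illustration.

```r
# Hypothetical data frame with a character column and one NA value
df <- data.frame(L = c("A", "B", "C", "D", "E"),
                 A = c(1, 2, 3, 4, 5),
                 B = c(2, 4, NA, 8, 10))

# Which columns are numeric? kmeans() needs every column to be numeric.
numeric_cols <- sapply(df, is.numeric)

# How many NA/NaN values does each column contain?
# (is.na() is TRUE for both NA and NaN)
na_counts <- colSums(is.na(df))

# How many infinite values does each numeric column contain?
inf_counts <- sapply(df[numeric_cols], function(col) sum(is.infinite(col)))

print(numeric_cols)
print(na_counts)
print(inf_counts)
```

Any column reported as non-numeric, or with a non-zero count, needs to be cleaned up before clustering.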

How to fix the error

Here are two examples of how to fix this problem. They parallel the examples in the previous section.

> df = data.frame(A = c(1,2,6,3,4,5),
+ B = c(2,4,NA,6,8,10),
+ C = c(3,6,7,9,12,15))
> df = na.omit(df)
> kmeans(df, centers = 2)
K-means clustering with 2 clusters of sizes 3, 2

In this example, we remove any rows that contain missing values using the na.omit function.
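The error message also mentions Inf, and na.omit does not remove infinite values. As a rough sketch, assuming a hypothetical data frame with one NA and one Inf, you could keep only the rows in which every value is finite; finite_rows and df_clean are illustrative names, not part of any package.

```r
# Hypothetical data: column B has one NA and one Inf
df <- data.frame(A = c(1, 2, 6, 3, 4, 5),
                 B = c(2, 4, NA, 6, 8, Inf),
                 C = c(3, 6, 7, 9, 12, 15))

# na.omit() drops the NA row but keeps the Inf row
nrow(na.omit(df))

# Keep only rows where every value is finite (not NA, NaN, Inf or -Inf)
finite_rows <- rowSums(!is.finite(as.matrix(df))) == 0
df_clean <- df[finite_rows, ]

# Fix the random starting centers so the result is reproducible
set.seed(1)
fit <- kmeans(df_clean, centers = 2)
fit$size
```

This assumes all columns are already numeric; a character column would have to be removed or converted first.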

> df = data.frame(L = c("A","B","C","D","E"),
+ A = c(1,2,3,4,5),
+ B = c(2,4,6,8,10),
+ C = c(3,6,9,12,15))
> df = subset(df, select = -L)
> kmeans(df, centers = 2)
K-means clustering with 2 clusters of sizes 3, 2

In this example, we simply remove the character column by using the subset function. If the characters are actually numbers stored as text, you could instead use a coercion function such as as.numeric to convert them into numeric values.
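For the coercion case, here is a minimal sketch, assuming a hypothetical data frame in which column B holds numbers stored as text. The as.character call is a precaution in case the column was read in as a factor.

```r
# Hypothetical data: column B contains numbers stored as text
df <- data.frame(A = c(1, 2, 3, 4, 5),
                 B = c("2", "4", "6", "8", "10"),
                 C = c(3, 6, 9, 12, 15),
                 stringsAsFactors = FALSE)

# Convert the text column to numeric values
df$B <- as.numeric(as.character(df$B))

# Values that cannot be parsed become NA (with a warning), so check for them
stopifnot(!anyNA(df$B))

# Fix the random starting centers so the result is reproducible
set.seed(1)
fit <- kmeans(df, centers = 2)
fit$size
```

If the check fails, some entries were not valid numbers, and those rows would need to be corrected or removed before clustering.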

This error is easy to run into when you do not have control over the data you are working with. The way to fix it is to remove or convert the values that are causing it. While you may not always be able to avoid this problem, it is an easy one to fix once you understand it.
