The random forest algorithm is an extremely useful tool for classification and regression: it fits many decision trees on random subsets of a dataset and combines their results into a single prediction. When using the randomForest package in R, keep in mind that the required formatting for inputs is murky at best, and you will soon meet a fair number of errors and warnings.

One in particular reads: Error in randomForest.default(m, y, ...) : NA/NaN/Inf in foreign function call (arg 1). This message can be tricky because it has several possible sources. The function cannot work with raw character columns; categorical variables need to be factors with clearly defined levels. Likewise, when the R function encounters a missing value, there is nothing to use at that position in the column vector, so the algorithm cannot process the dataset.
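Before fitting, it can help to check which columns carry the conditions described above. The following is a minimal sketch in base R; diagnose_rf_input and example_df are hypothetical names made up for illustration, not part of the randomForest package.

```r
# Hypothetical helper (not part of randomForest): report, per column,
# the conditions that commonly trigger the NA/NaN/Inf error.
diagnose_rf_input <- function(df) {
  data.frame(
    column       = names(df),
    is_character = vapply(df, is.character, logical(1)),
    # is.na() is TRUE for both NA and NaN
    n_missing    = vapply(df, function(col) sum(is.na(col)), integer(1)),
    # is.infinite() is only meaningful for numeric columns
    n_infinite   = vapply(df, function(col) {
      if (is.numeric(col)) sum(is.infinite(col)) else 0L
    }, integer(1)),
    row.names = NULL
  )
}

# Made-up example data with one NA, one Inf, and a character column
example_df <- data.frame(
  y  = c(1.5, NA, 3.2, Inf),
  x1 = c("A", "B", "B", "C"),
  stringsAsFactors = FALSE
)

report <- diagnose_rf_input(example_df)
print(report)
```

Any column flagged as character, or with a nonzero count in either of the other two columns, is a candidate source of the error.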

For a dataset that contains infinite values, you could switch to another tool, but staying the course with random forest only means filtering the problem values out of your data.
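One way to apply such a filter is sketched below in base R. It assumes the data frame has at least one numeric column, and example_df is made-up data for illustration. Note that is.finite is also FALSE for NA and NaN, so this filter drops those rows too.

```r
# Made-up example data containing +Inf and -Inf
example_df <- data.frame(
  y = c(10, Inf, 12, -Inf, 9),
  x = c(1, 2, 3, 4, 5)
)

# Which columns are numeric?
numeric_cols <- vapply(example_df, is.numeric, logical(1))

# TRUE for rows where every numeric value is finite
# (is.finite() is FALSE for NA, NaN, Inf and -Inf)
finite_rows <- Reduce(`&`, lapply(example_df[numeric_cols], is.finite))

# Keep only the rows that passed the filter
clean_df <- example_df[finite_rows, , drop = FALSE]
print(clean_df)
```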

Here's one example that creates a data frame of normally distributed data:

set.seed(19880303) # Setting the seed to my birthday

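library(data.table) # install.packages('data.table') if necessary

norms <- list()

for (i in 1:10) {

  norms[[i]] <- data.table(x = rnorm(100)) # assumed body: the original right-hand side was lost, any normal draw works here

}

dt <- rbindlist(norms) # binding it all together in a data table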

rm(norms) # remove the list

Now let's identify all the rows with missing values (complete.cases flags NA and NaN, but not Inf):

dt[!complete.cases(dt)]

Then we can convert the character columns to factors:

for (i in seq_len(ncol(dt)))

  if (is.character(dt[[i]]))

    dt[[i]] <- as.factor(dt[[i]])

That should make the character columns readable by the function. Rows with missing values still need to be handled separately.

As a simpler means to an end, you could just remove the problem values entirely if they don't hold much importance for your final result.
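Dropping incomplete rows can be sketched with na.omit from base R; example_df below is made-up data for illustration.

```r
# Made-up example data with one missing response and one missing factor level
example_df <- data.frame(
  y  = c(30, NA, 45, 23),
  x1 = factor(c("A", "B", NA, "C"))
)

# na.omit() drops every row that contains at least one NA or NaN
clean_df <- na.omit(example_df)
print(clean_df)
```

The formula interface of randomForest also accepts an na.action argument, so passing na.action = na.omit at fit time should have a similar effect. Either way, check how many rows are lost before discarding them.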

Suppose you fit a random forest on a data frame that contains a character column:

library(randomForest)

df <- data.frame(y = c(30, 29, 30, 45, 23, 19, 9, 8, 11, 14),

                 x1 = c('A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),

                 x2 = c(4, 4, 5, 7, 8, 7, 9, 6, 13, 15))

model <- randomForest(formula = y ~ ., data = df)

Error in randomForest.default(m, y, ...) :

NA/NaN/Inf in foreign function call (arg 1)

Here x1 is stored as a character column (the default for data.frame since R 4.0), which triggers the error. The fix is simply to convert the character columns into factors.

library(dplyr)

df = df %>% mutate_if(is.character, as.factor)
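If you would rather not load dplyr, the same conversion can be sketched in base R. The example below uses a made-up example_df so it runs on its own; the two conversion lines are what you would apply to your own data frame.

```r
# Made-up example data with one character column
example_df <- data.frame(
  y  = c(30, 29, 9),
  x1 = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# Find the character columns and convert only those to factors
char_cols <- vapply(example_df, is.character, logical(1))
example_df[char_cols] <- lapply(example_df[char_cols], as.factor)

str(example_df)
```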

Now the model fits the data frame without errors.

model <- randomForest(formula = y ~ ., data = df)

model # print the fitted model summary

Call:

randomForest(formula = y ~ ., data = df)

Type of random forest: regression

Number of trees: 500

No. of variables tried at each split: 1

Mean of squared residuals: 65.0047

% Var explained: 48.64

Moving forward with R and its random forest tools, it helps to recognize problem values in your datasets ahead of time (character columns, missing values, and infinite values) and to identify and correct them before fitting, so that you can make full use of the model in your regression and classification projects.