How to eliminate "NA/NaN/Inf in foreign function call (arg 7)" running predict with randomForest

Question

I have researched this extensively without finding a solution. I have cleaned my data set as follows:

library("raster")
impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x) , 
mean(x, na.rm = TRUE))
losses <- apply(losses, 2, impute.mean)
colSums(is.na(losses))
isinf <- function(x) (NA <- is.infinite(x))
infout <- apply(losses, 2, is.infinite)
colSums(infout)
isnan <- function(x) (NA <- is.nan(x))
nanout <- apply(losses, 2, is.nan)
colSums(nanout)

The problem arises running the predict algorithm:

options(warn=2)
p  <-   predict(default.rf, losses, type="prob", inf.rm = TRUE, na.rm=TRUE, nan.rm=TRUE)

All the research says it should be NA's or Inf's or NaN's in the data but I don't find any. I am making the data and the randomForest summary available for sleuthing at [deleted] Traceback doesn't reveal much (to me anyway):

4: .C("classForest", mdim = as.integer(mdim), ntest = as.integer(ntest), 
       nclass = as.integer(object$forest$nclass), maxcat = as.integer(maxcat), 
       nrnodes = as.integer(nrnodes), jbt = as.integer(ntree), xts = as.double(x), 
       xbestsplit = as.double(object$forest$xbestsplit), pid = object$forest$pid, 
       cutoff = as.double(cutoff), countts = as.double(countts), 
       treemap = as.integer(aperm(object$forest$treemap, c(2, 1, 
           3))), nodestatus = as.integer(object$forest$nodestatus), 
       cat = as.integer(object$forest$ncat), nodepred = as.integer(object$forest$nodepred), 
       treepred = as.integer(treepred), jet = as.integer(numeric(ntest)), 
       bestvar = as.integer(object$forest$bestvar), nodexts = as.integer(nodexts), 
       ndbigtree = as.integer(object$forest$ndbigtree), predict.all = as.integer(predict.all), 
       prox = as.integer(proximity), proxmatrix = as.double(proxmatrix), 
       nodes = as.integer(nodes), DUP = FALSE, PACKAGE = "randomForest")
3: predict.randomForest(default.rf, losses, type = "prob", inf.rm = TRUE, 
       na.rm = TRUE, nan.rm = TRUE)
2: predict(default.rf, losses, type = "prob", inf.rm = TRUE, na.rm = TRUE, 
       nan.rm = TRUE)
1: predict(default.rf, losses, type = "prob", inf.rm = TRUE, na.rm = TRUE, 
       nan.rm = TRUE)

Nate Pope · Accepted Answer

Your code is not entirely reproducible (there's no running of the actual randomForest algorithm) but you are not replacing Inf values with the means of column vectors. This is because the na.rm = TRUE argument in the call to mean() within your impute.mean function does exactly what it says -- removes NA values (and not Inf ones).

You can see this, for example, by:

impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x, na.rm = TRUE))
losses <- apply(losses, 2, impute.mean)
sum( apply( losses, 2, function(.) sum(is.infinite(.))) )
# [1] 696

To get rid of infinite values, use:

impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x[!is.na(x) & !is.nan(x) & !is.infinite(x)]))
losses <- apply(losses, 2, impute.mean)
sum(apply( losses, 2, function(.) sum(is.infinite(.)) ))
# [1] 0

Sam Firke · Answer

One cause of the error message:

NA/NaN/Inf in foreign function call (arg X)

When training a randomForest is having character-class variables in your data.frame. If it comes with the warning:

NAs introduced by coercion

Check to make sure that all of your character variables have been converted to factors.

Example

set.seed(1)
dat <- data.frame(
  a = runif(100),
  b = rpois(100, 10),
  c = rep(c("a","b"), 100),
  stringsAsFactors = FALSE
)

library(randomForest)
randomForest(a ~ ., data = dat)

Yields:

Error in randomForest.default(m, y, ...) : NA/NaN/Inf in foreign function call (arg 1) In addition: Warning message: In data.matrix(x) : NAs introduced by coercion

But switch it to stringsAsFactors = TRUE and it runs.

How to eliminate "NA/NaN/Inf in foreign function call (arg 7)" running predict with randomForest

Tags:

r

runtime-error

random-forest

predict

Elliott

2 Answers

Nate Pope

Sam Firke

Recent Activity

Donate For Us

How to eliminate "NA/NaN/Inf in foreign function call (arg 7)" running predict with randomForest

Tags:

r

runtime-error

random-forest

predict

Elliott

2 Answers

Nate Pope

Sam Firke

Related questions

Recent Activity

Donate For Us