Has anyone been able to set up a classification (not a regression) using randomForest and the bigmemory library? I am aware that the "formula approach" cannot be used and that we have to resort to the "x = predictors, y = response" approach. It appears that the bigmemory library is unable to deal with a response vector that has categorical values (it's a matrix, after all). In my case, I have two levels, both represented as characters.
According to the bigmemory documentation: "A data frame will have character vectors converted to factors, and then all factors converted to numeric factor levels"
Any suggested workarounds to get randomForest classification to work with bigmemory?
# Example of the problem
library(randomForest)
library(bigmemory)
# Removing any extra objects from my workspace (just in case)
rm(list=ls())
# First, a small matrix
small.mat <- matrix(sample(0:1,5000,replace = TRUE),1000,5)
colnames(small.mat) <- paste("V",1:5,sep = "")
small.mat[,5] <- as.factor(small.mat[,5])
small.rf <- randomForest(V5 ~ .,data = small.mat, mtry=2, do.trace=100)
print(small.rf)
small.result <- matrix(0,1000,1)
# predict() takes newdata=; an unrecognised data= argument is silently ignored
small.result <- predict(small.rf, newdata=small.mat[,-5])
# Now a small data frame: works!
small.mat <- matrix(sample(0:1,5000,replace = TRUE),1000,5)
colnames(small.mat) <- paste("V",1:5,sep = "")
small.data <- as.data.frame(small.mat)
small.data[,5] <- as.factor(small.data[,5])
small.rf <- randomForest(V5 ~ .,data = small.data, mtry=2, do.trace=100)
print(small.rf)
small.result <- matrix(0,1000,1)
small.result <- predict(small.rf, newdata=small.data[,-5])
# Then a big.matrix: classification does NOT work :-(
#----------------****************************----
big.mat <- as.big.matrix(small.mat, type = "integer")
# The line below throws the error: "cannot coerce class 'structure("big.matrix", package = "bigmemory")' into a data.frame"
big.rf <- randomForest(V5~.,data = big.mat, do.trace=10)
# Runs without error, but only fits a regression
big.rf <- randomForest(x = big.mat[,-5], y = big.mat[,5], mtry=2, do.trace=100)
print(big.rf)
big.result <- matrix(0,1000,1)
big.result <- predict(big.rf, newdata=big.mat[,-5])
The bigrf package may help. Currently, it supports classification with a limited number of features.
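Another possible workaround, sketched here rather than tested on genuinely large data: randomForest fits a classification forest whenever y is a factor, and indexing a big.matrix with [ returns ordinary R objects, so the response can be converted to a factor after it is extracted. The labels "no" and "yes" are made up for illustration, and note that big.mat[, -5] copies the predictors into RAM, so this only helps while the extracted data still fits in memory.

```r
library(randomForest)
library(bigmemory)

set.seed(42)
small.mat <- matrix(sample(0:1, 5000, replace = TRUE), 1000, 5)
colnames(small.mat) <- paste("V", 1:5, sep = "")
big.mat <- as.big.matrix(small.mat, type = "integer")

# Extract predictors and response as ordinary R objects.
# Assumption: the extracted predictors fit in memory.
x <- big.mat[, -5]

# Convert the integer-coded response back to a factor.
# The labels are hypothetical; use your own two class names.
y <- factor(big.mat[, 5], levels = c(0, 1), labels = c("no", "yes"))

# With a factor response, randomForest fits a classification forest.
big.rf <- randomForest(x = x, y = y, mtry = 2, ntree = 100)
print(big.rf$type)

big.result <- predict(big.rf, newdata = x)
```

Only the response is converted; the big.matrix itself still holds integers, so the mapping between integer codes and class labels has to be kept outside the matrix.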