I'm trying to split a data frame into two parts: the columns that are factors and the columns that are not.
I'm looking to do something like:
This part works:
factorCols <- sapply(df1, is.factor)
factorDf <- df1[,factorCols]
This part won't work:
nonFactorCols <- sapply(df1, !is.factor)
due to this error:
Error in !is.factor : invalid argument type
Is there a correct way to do this?
sapply() is a built-in R function that applies a function to each element of its input. It takes a list, vector, or data frame (where the elements are the columns) and, by default, simplifies the result to a vector or matrix where possible. sapply() is a wrapper function around lapply(); the difference is that lapply() always returns a list.
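As a rough illustration of that difference, the sketch below runs is.factor over the columns of a small data frame with both functions. The data frame df1 and its column names are made up for this example and are not from the original question.

```r
# Hypothetical data frame, used only for illustration
df1 <- data.frame(
  id    = 1:3,
  group = factor(c("a", "b", "a")),
  score = c(2.5, 3.1, 4.0)
)

# sapply() calls is.factor() once per column and simplifies the
# results into a named logical vector
as_vector <- sapply(df1, is.factor)
print(as_vector)

# lapply() makes the same calls but always returns a list
as_list <- lapply(df1, is.factor)
print(as_list)
```

The named logical vector from sapply() is what makes it convenient for column selection, since it can be used directly as an index.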
In R, factors are used to work with categorical variables, variables that have a fixed and known set of possible values. They are also useful when you want to display character vectors in a non-alphabetical order. Historically, factors were much easier to work with than characters.
To create or modify a factor in R, call factor() with a vector as input. Creating a factor takes two steps: create a vector, then convert it into a factor using factor().
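The two steps might look like the following minimal sketch. The vector sizes and its values are invented for illustration; the levels argument is one way to get the non-alphabetical ordering mentioned above.

```r
# Step 1: create a plain character vector (made-up values)
sizes <- c("medium", "small", "large", "small")

# Step 2: convert it to a factor; `levels` fixes the set of allowed
# values and their order (here, deliberately not alphabetical)
sizes_f <- factor(sizes, levels = c("small", "medium", "large"))

print(levels(sizes_f))     # the levels, in the order given above
print(is.factor(sizes))    # FALSE: still a character vector
print(is.factor(sizes_f))  # TRUE
print(sort(sizes_f))       # sorted by level order, not alphabetically
```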
Correct way:
nonFactorCols <- sapply(df1, function(col) !is.factor(col))
# or, more efficiently
nonFactorCols <- !sapply(df1, is.factor)
# or, even more efficiently
nonFactorCols <- !factorCols
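Putting the pieces together, a complete split might look like the sketch below. The data frame is hypothetical, and drop = FALSE is an addition not in the original code: without it, selecting a single column with df1[, cols] returns a bare vector rather than a data frame.

```r
# Hypothetical data frame, used only for illustration
df1 <- data.frame(
  id    = 1:3,
  group = factor(c("a", "b", "a")),
  score = c(2.5, 3.1, 4.0)
)

factorCols    <- sapply(df1, is.factor)  # TRUE for factor columns
nonFactorCols <- !factorCols             # negate the logical vector

# drop = FALSE keeps the result a data frame even when only
# one column is selected
factorDf    <- df1[, factorCols, drop = FALSE]
nonFactorDf <- df1[, nonFactorCols, drop = FALSE]

print(names(factorDf))
print(names(nonFactorDf))
```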
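Joshua gave you the correct way to do it. As for why sapply(df1, !is.factor) did not work: sapply expects a function as its second argument, and !is.factor is not a function. When R evaluates !is.factor, it applies the ! operator to the function object is.factor itself. The ! operator returns a logical value and cannot take a function as its argument, hence the "invalid argument type" error.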
Alternatively, you could use Negate(is.factor), which does in fact return a function.
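A short sketch of the Negate approach, assuming a made-up data frame df1; the name is_not_factor is hypothetical and chosen only for readability.

```r
# Hypothetical data frame, used only for illustration
df1 <- data.frame(
  id    = 1:3,
  group = factor(c("a", "b", "a")),
  score = c(2.5, 3.1, 4.0)
)

# Negate() takes a predicate function and returns a new function
# that gives the logical negation of its result
is_not_factor <- Negate(is.factor)
print(is.function(is_not_factor))  # TRUE

# Because it is a function, it can be passed straight to sapply()
nonFactorCols <- sapply(df1, is_not_factor)
print(nonFactorCols)
```

This should give the same logical vector as !sapply(df1, is.factor); which form to use is mostly a matter of taste.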