How can I subset rows from a data frame when the values in a given column are blank or NA? For example:
x <- c(1,2,3,4,"","","")
y <- c("A","B","C","D","E","F","G")
z <- c(100,200,300,400,500,600,700)
xyz <- data.frame(x,y,z)
View(xyz)

g1 <- subset(xyz, subset=(x > 0))
Returns:
Warning message: In Ops.factor(x, 0) : > not meaningful for factors
How can I get it to return a new data frame that is a subset of the original, containing only the rows where the x column is greater than zero?
When you created your data frame, x ended up as a factor variable.
(Technically, you never asked for a factor. Because you combined numbers and empty strings in the one vector, c() coerced them all to character mode, and data.frame() then converted that character vector to a factor for you, which is the default behaviour in R versions before 4.0.0.)
Because of this, "greater than zero" is not a meaningful comparison in this context. I'm going to read your mind and conclude that you actually want x to be numeric, with NA wherever the value is not available. In that case, you should do
xyz$x <- as.numeric(as.character(xyz$x))
subset(xyz, !is.na(x))
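If you want the x > 0 filter from the question rather than only dropping the missing values, the same conversion makes it work. The following is a minimal sketch that rebuilds the example data; passing stringsAsFactors = FALSE is my addition so that it behaves the same on old and new R versions, and g1 is simply the name used in the question.

```r
# Rebuild the example data from the question.
# c() coerces the mixed values in x to character.
x <- c(1, 2, 3, 4, "", "", "")
y <- c("A", "B", "C", "D", "E", "F", "G")
z <- c(100, 200, 300, 400, 500, 600, 700)

# stringsAsFactors = FALSE (an assumption added here) keeps x as character
# instead of a factor, regardless of the R version.
xyz <- data.frame(x, y, z, stringsAsFactors = FALSE)

# Convert x to numeric; the blank strings become NA.
# as.character() is a no-op here but keeps the line safe if x is a factor.
xyz$x <- as.numeric(as.character(xyz$x))

# subset() treats NA in the condition as FALSE, so rows with a missing x
# are dropped together with any rows where x is not greater than zero.
g1 <- subset(xyz, x > 0)
```

With this data, g1 should contain only the rows for A to D, since every non-missing x is already positive.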