Why does apply convert logicals in data frames to strings of 5 characters?

Tags: dataframe, r, apply

Suppose I have a data frame:

mydf <- data.frame(colA = c(1, 20), colB = c("a", "ab"), colC = c(TRUE, FALSE))

Now suppose I want to apply a function to each row of the data frame. This function uses the boolean value in column C. When using apply, every non-string value is converted to a string padded to the width of the longest value in its column:

> apply(mydf, 1, '[', 3)
[1] " TRUE" "FALSE"

The string " TRUE" is no longer interpretable as a logical.

> ifelse(apply(mydf, 1, '[', 3), 1, 2)
[1] NA  2

I could work around this with gsub(" ", "", x), but I'd bet there is a better way. Why does apply pad the values when it could just convert the logicals directly to strings? Is there an apply-like function that does not have this behavior?

Will Beason asked Sep 15 '14 18:09



1 Answer

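apply works on matrices, so when you called it, your data frame was first converted with as.matrix. Because colB is not numeric, the result is a character matrix, and a matrix can hold only one type. The non-character columns are converted with format(), which pads every element to the width of the widest element in its column. That is where the leading space in " TRUE" comes from.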
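The coercion can be reproduced without apply. The following is a minimal sketch that reuses mydf from the question and only base R functions; the variable name m is just for illustration.

```r
mydf <- data.frame(colA = c(1, 20), colB = c("a", "ab"), colC = c(TRUE, FALSE))

# apply() operates on a matrix, so the data frame goes through as.matrix() first.
m <- as.matrix(mydf)
typeof(m)            # "character": the non-numeric colB forces a character matrix

# format() pads logicals to a common width, which produces the leading space.
format(mydf$colC)    # " TRUE" "FALSE"
m[, "colC"]          # the padded strings that apply() passes to the function

# as.logical() does not trim whitespace, hence the NA in the question.
as.logical(" TRUE")  # NA
as.logical("TRUE")   # TRUE
```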

To keep the logical type, you can index the rows yourself with a for-loop-like sapply call:

> ( s <- sapply(seq_len(nrow(mydf)), function(i) mydf[i, 3]) )
# [1]  TRUE FALSE
> class(s)
# [1] "logical"

A workaround that keeps your apply call would be to strip the whitespace and convert back to logical:

> as.logical(gsub("\\s+", "", apply(mydf, 1, `[`, 3)))
# [1]  TRUE FALSE

But note that both of these give exactly the same result as extracting the column directly:

> mydf[,3]
# [1]  TRUE FALSE
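If the real goal is a row-wise function that needs several columns at once, one option is to iterate over the columns in parallel instead of over matrix rows. The sketch below assumes a hypothetical row function, row_fun, that returns 1 when colC is TRUE and 2 otherwise; it stands in for whatever the actual function does.

```r
mydf <- data.frame(colA = c(1, 20), colB = c("a", "ab"), colC = c(TRUE, FALSE))

# Hypothetical row function: uses the logical flag directly, no string parsing.
row_fun <- function(colA, colB, colC) if (colC) 1 else 2

# mapply() walks the columns in parallel, so each argument keeps its own type
# and the data frame is never turned into a character matrix.
res <- mapply(row_fun, mydf$colA, mydf$colB, mydf$colC)
res   # 1 2
```

For a case this simple, the vectorized ifelse(mydf$colC, 1, 2) gives the same values with no row iteration at all.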
Rich Scriven answered Oct 20 '22 01:10
