Why does apply() not work on my data frame in R?

Tags: r, apply

I have a dataframe named "adult"

> str(adult[, 1:2])
'data.frame':   32561 obs. of  15 variables:
 $ age      : int  39 50 38 53 28 37 49 52 31 42 ...
 $ worktp   : Factor w/ 9 levels " ?"," Federal-gov",..: 8 7 5 5 5 5 5 7 5 5 ...

> is.factor(adult[,1])
[1] FALSE

> is.factor(adult[,2])
[1] TRUE

Everything works well until I use

> apply(adult[,1:2], 2, function(x) is.factor(x))
age worktp 
FALSE  FALSE 

Why do I get FALSE for worktp when is.factor() just gave me TRUE? I really need apply() to work on my data frame. Should I use one of the other apply-related functions instead?

Thanks!

wen asked Jan 14 '14 09:01


1 Answer

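apply() converts your data frame to a matrix before processing it (see the Details section of ?apply). A matrix can hold only one type of value, so with a factor column present everything becomes character and the factor information is lost in this step.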

d <- data.frame(num=1:4, fac=factor(1:4))
d[, 2]
[1] 1 2 3 4
Levels: 1 2 3 4        # levels, hence a factor

m <- as.matrix(d)
m[, 2]
[1] "1" "2" "3" "4"     # no levels anymore

apply(d, 2, is.factor)

  num   fac 
FALSE FALSE             # no factors as converted to matrix

To get what you want, you could use lapply, which works on the columns of the data frame directly:

lapply(d, is.factor)
$num
[1] FALSE

$fac
[1] TRUE

or sapply

sapply(d, is.factor)
  num   fac 
FALSE  TRUE 
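
If you want the result type to be checked as well, vapply is another option. The following is only a minimal sketch on the same toy data frame d as above (not on the adult data); the names classes_via_apply, is_fac and factor_cols are made up for illustration.

```r
# Toy data mirroring the example above (not the asker's `adult` data).
d <- data.frame(num = 1:4, fac = factor(1:4))

# apply() works on the character matrix produced by as.matrix(d),
# so both columns report the class "character".
classes_via_apply <- apply(d, 2, class)

# vapply() walks over the data frame's columns as they are and
# requires each result to be a single logical value.
is_fac <- vapply(d, is.factor, logical(1))

# The logical vector can then be used to keep only the factor columns.
factor_cols <- d[, is_fac, drop = FALSE]
```

Compared with sapply, vapply stops with an error if the function ever returns something other than the declared logical(1), which can be helpful inside scripts.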
Mark Heckmann answered Oct 07 '22 00:10