Why does sapply() return a list?

Tags:

r

I ran into a strange behaviour in R with the sapply() function. It is supposed to return a vector, but in the special case where you give it an empty vector, it returns a list.

Correct behaviour with a vector:

a = c("A", "B", "C")
a[a == "B"]  # Returns "B"
a[sapply(a, function(x) {x == "B"})] # Returns "B"

Correct behaviour with a NULL value:

a = NULL
a[a == "B"]  # Returns NULL
a[sapply(a, function(x) {x == "B"})] # Returns NULL

Strange behaviour with an empty vector:

a = vector()
a[a == "B"]  # Returns logical(0)
a[sapply(a, function(x) {x == "B"})] # Error: invalid subscript type 'list'

Same error message as with this statement:

a[list()] # Error in a[list()] : invalid subscript type 'list'

Why? Is it a bug?

Because of this behaviour, I use unlist(lapply()) instead.
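The unlist(lapply()) workaround mentioned above can be sketched roughly as follows. The helper name `filter_equal` is hypothetical and only there for illustration; the point is that unlist() turns the empty list into NULL, which `[` accepts as an index.

```r
# Hypothetical helper illustrating the unlist(lapply()) workaround.
filter_equal <- function(a, target) {
  # lapply() always returns a list; unlist() flattens it to a vector,
  # or to NULL when the list is empty.
  idx <- unlist(lapply(a, function(x) x == target))
  a[idx]
}

res_full  <- filter_equal(c("A", "B", "C"), "B")  # one match
res_empty <- filter_equal(vector(), "B")          # zero-length result, no error
res_null  <- filter_equal(NULL, "B")              # NULL stays NULL
```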

Bix, asked Mar 12 '12 14:03


2 Answers

The real reason for this is that sapply doesn't know what your function will return without calling it. In your case the function returns a logical, but since sapply is given an empty vector, the function is never called. Therefore, it has to come up with a type, and it defaults to list.

For this very reason (and for performance), vapply was introduced! It requires you to specify the type (and length) of the return value, which lets it do the right thing even for empty input. As a bonus, it is also faster!

sapply(LETTERS[1:3], function(x) {x == "B"}) # F, T, F
sapply(LETTERS[0], function(x) {x == "B"})   # list()

vapply(LETTERS[1:3], function(x) {x == "B"}, logical(1)) # F, T, F
vapply(LETTERS[0], function(x) {x == "B"}, logical(1))   # logical()

See ?vapply for more info.
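As a minimal sketch of how this applies to the filtering in the question, the helper below wraps vapply in a function. The name `keep_equal` is hypothetical, not part of R; the `logical(1)` template is what guarantees a logical index even when the input is empty.

```r
# Hypothetical helper: keep the elements of `x` that equal `target`.
keep_equal <- function(x, target) {
  # logical(1) declares that every call returns exactly one logical value,
  # so a zero-length input gives a zero-length logical vector, not a list.
  hits <- vapply(x, function(el) el == target, logical(1))
  x[hits]
}

res_full  <- keep_equal(c("A", "B", "C"), "B")  # one match
res_chr0  <- keep_equal(character(0), "B")      # empty character vector, no error
res_empty <- keep_equal(vector(), "B")          # empty vector, no error
```

vapply also raises an error if the function ever returns something that does not match the template, so type surprises show up at the call site instead of later during subsetting.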

Tommy, answered Sep 17 '22 15:09


The help page ?sapply has this in its Value section:

For ‘sapply(simplify = TRUE)’ and ‘replicate(simplify = TRUE)’: if
‘X’ has length zero or ‘n = 0’, an empty list.

In both your cases:

> length(NULL)
[1] 0
> length(vector())
[1] 0

Hence sapply() returns:

> sapply(vector(), function(x) {x == "B"})
list()
> sapply(NULL, function(x) {x == "B"})
list()

Your error comes not from sapply() but from [, as this shows:

> a[list()]
Error in a[list()] : invalid subscript type 'list'

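So the issue lies in how subsetting is performed on NULL versus an empty vector (vector()): subsetting NULL returns NULL whatever the index, while an empty vector rejects a list index. It has nothing to do with sapply() at all, which returns consistent output in both cases: an empty list.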
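A short sketch to check this yourself: the variable names are made up for illustration, and the error is caught with tryCatch so the script keeps running. The message text is not compared because it depends on the locale.

```r
f <- function(x) x == "B"

# sapply() gives the same empty list for both inputs.
idx_from_null  <- sapply(NULL, f)
idx_from_empty <- sapply(vector(), f)

# Subsetting NULL returns NULL regardless of the index type.
from_null <- NULL[idx_from_null]

# Subsetting an empty vector with a list index raises an error;
# capture its message instead of stopping.
from_empty <- tryCatch(
  vector()[idx_from_empty],
  error = function(e) conditionMessage(e)
)
```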

Gavin Simpson, answered Sep 21 '22 15:09