
Combining tapply and 'not in' logic, using R

Tags: r, tapply, notin

How do I combine the tapply command with 'not in' logic?

Objective: Obtain the median sepal length for each species.

tapply(iris$Sepal.Length, iris$Species, median)

Constraint: Exclude entries whose petal width is 1.3 or 1.5.

!iris$Petal.Width %in% c('1.3', '1.5')

Attempt:

tapply(iris$Sepal.Length, iris$Species, median[!iris$Petal.Width %in% c('1.3', '1.5')])

Result: the error message "object of type 'closure' is not subsettable".

---

The iris example here is a stand-in for my own dataset, where the same approach gives the same error message. I imagine something is wrong with my syntax. What is it?

bubbalouie asked May 11 '15 21:05


2 Answers

Try

with(iris[!iris$Petal.Width %in% c('1.3', '1.5'),], tapply(Sepal.Length, Species, median))
# setosa versicolor  virginica 
#    5.0        5.8        6.5 

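The idea here is to subset the data first and only then run tapply on what remains.

Your line didn't work because median[...] tries to subset the function median itself, which is where the 'closure' error comes from. Beyond that, FUN is applied to each group's piece of X (Sepal.Length in your case) rather than to the whole data set, so it is not the place to filter rows.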
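
To make the cause of the error concrete, here is a minimal sketch (the names err, keep and res are only illustrative and do not appear in the original post). It shows that indexing median with [ ] fails on its own, and that filtering X and INDEX with the same logical vector is one possible way to get the per-species medians without building a subsetted data frame:

```r
# `median` is a function (a closure), so indexing it with [ ] raises an
# error as soon as R evaluates the expression, independent of tapply().
err <- tryCatch(median[c(TRUE, FALSE)],
                error = function(e) conditionMessage(e))
print(err)

# Rows to keep. Petal.Width is numeric, so numeric literals are used here.
keep <- !iris$Petal.Width %in% c(1.3, 1.5)

# Apply the same mask to both X and INDEX so they stay aligned,
# then take the median within each species.
res <- tapply(iris$Sepal.Length[keep], iris$Species[keep], median)
print(res)
```

This should be equivalent to the with(...) call above; it just applies the filter to the two vectors directly.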

David Arenburg answered Nov 03 '22 08:11


Here is a workaround that you should not use:

tapply(
  1:nrow(iris),
  iris$Species,
  function(i) median(iris$Sepal.Length[
    (1:nrow(iris) %in% i) &
    !(iris$Petal.Width %in% c('1.3', '1.5'))
  ])
)

Things get ugly if you subset after splitting the vector in this way. You effectively have to

  • split it again (when using 1:nrow(iris) %in% i) and
  • compute the subset once for each value of iris$Species (when using !(iris$Petal.Width %in% c('1.3', '1.5'))).
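
As a rough check of the point above, the following sketch compares the two orders of operations (keep, filter_first and split_first are hypothetical names introduced here, not part of the original answers). The split-first version is written a little more compactly than the workaround, but it still has to re-apply the filter inside every group:

```r
# Compute the filter once, up front.
keep <- !iris$Petal.Width %in% c(1.3, 1.5)

# Filter first, then split: the mask is applied a single time.
filter_first <- tapply(iris$Sepal.Length[keep], iris$Species[keep], median)

# Split first, then filter: FUN receives the row indices of one species
# and must look up the mask again for each group.
split_first <- tapply(
  seq_len(nrow(iris)),
  iris$Species,
  function(i) median(iris$Sepal.Length[i[keep[i]]])
)

# Both orders should give the same medians.
print(all.equal(as.vector(filter_first), as.vector(split_first)))
```

If the two agree, the filter-first form is usually the easier one to read and maintain.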
Frank answered Nov 03 '22 09:11