What is the equivalent of SQL's IN keyword in R?

In SQL, you can avoid chaining multiple OR conditions when looking for many values of a particular variable (column) by using IN. For example:

SELECT * FROM colors WHERE color IN ('Red', 'Blue', 'Green');

How would I do that in R? I am currently having to do it like this:

shortlisted_colors <- subset(colors, color == 'Red' | color == 'Blue' | color == 'Green')

What is a better way?

user3422637 asked Jul 17 '14 19:07


2 Answers

shortlisted_colors <- subset(colors, color %in% c('Red', 'Blue', 'Green'))
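
To make the one-liner above reproducible, here is a minimal sketch. The contents of `colors` are not shown in the question, so the data frame and the `wanted` vector below are made up for illustration. The sketch also shows how a SQL-style NOT IN could be written by negating `%in%`.

```r
# Hypothetical sample data; the question does not show what `colors` contains.
colors <- data.frame(
  id = 1:5,
  color = c("Red", "Yellow", "Blue", "Green", "Black"),
  stringsAsFactors = FALSE
)
wanted <- c("Red", "Blue", "Green")

# SQL: WHERE color IN ('Red', 'Blue', 'Green')
shortlisted_colors <- subset(colors, color %in% wanted)

# SQL: WHERE color NOT IN ('Red', 'Blue', 'Green')
other_colors <- subset(colors, !(color %in% wanted))

# The same IN filter written with bracket indexing instead of subset()
shortlisted_again <- colors[colors$color %in% wanted, ]
```

Bracket indexing is often preferred inside functions, since `subset()` relies on non-standard evaluation of the column name.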
nrussell answered Nov 04 '22 12:11


Searching on "in" is difficult, but the answer is "%in%". The search is harder still because "in" is a reserved word in R, owing to its use in the iterator specification of for-loops. The equivalent of your SQL query is:

subset(colors, color %in% c('Red', 'Blue', 'Green'))

See:

?match
?'%in%'   # since you need to quote names with special symbols in them
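
The two help topics share a page because `%in%` is a thin wrapper around `match()`. A small sketch of the relationship follows; the vectors `x` and `lookup` are made-up examples.

```r
# Made-up example vectors for illustration.
x <- c("Red", "Purple", NA, "Green")
lookup <- c("Red", "Blue", "Green")

# match() returns the position of each element of x in lookup,
# or NA where there is no match.
positions <- match(x, lookup)

# %in% reduces that to TRUE/FALSE and never returns NA,
# which is what makes it safe to use as a row filter.
is_member <- x %in% lookup

# The two agree when "no match" is mapped to 0.
same <- identical(is_member, match(x, lookup, nomatch = 0L) > 0L)
```

This differs from `==`, which would propagate the NA in `x` into the result.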

That help page also illustrates how "%"-signs enclose user-defined infix function names. Understanding this gives you a leg up on seeing how @hadley has raised the approach to a much higher level in his dplyr package. If you have a solid background in SQL, looping back to see what dplyr offers should be very satisfying. As I understand it, dplyr functions act as a front end to SQL operations in many instances.
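
As a rough sketch of the user-defined infix idea: any function whose name is wrapped in "%"-signs can be called between its two arguments. The operator name `%notin%` and the sample data below are defined here purely for illustration. The dplyr line runs only if that package happens to be installed.

```r
# Operator defined here for illustration; backticks are needed when assigning it.
`%notin%` <- function(x, table) !(x %in% table)

# Hypothetical sample data, not taken from the question.
colors <- data.frame(
  color = c("Red", "Yellow", "Blue", "Green", "Black"),
  stringsAsFactors = FALSE
)
wanted <- c("Red", "Blue", "Green")

# Rows whose color is NOT in the wanted set
excluded <- subset(colors, color %notin% wanted)

# The same %in% filter expressed with dplyr, guarded in case dplyr is absent
if (requireNamespace("dplyr", quietly = TRUE)) {
  shortlisted_dplyr <- dplyr::filter(colors, color %in% wanted)
}
```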

IRTFM answered Nov 04 '22 10:11