In SQL, you can easily avoid multiple OR conditions when looking for many values of a particular variable (column) by using IN. For example:
SELECT * FROM colors WHERE color IN ('Red', 'Blue', 'Green');
How would I do that in R? I am currently having to do it like this:
shortlisted_colors <- subset(colors, color == 'Red' | color == 'Blue' | color == 'Green')
What is a better way?
shortlisted_colors <- subset(colors, color %in% c('Red', 'Blue', 'Green'))
I suppose it might be difficult to search on "in", but the answer is "%in%". Searching might also be difficult because "in" is a reserved word in R, owing to its use in the iterator specification of for-loops:
subset(colors, color %in% c('Red', 'Blue', 'Green'))
See:
?match
?'%in%' # since you need to quote names with special symbols in them
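To make the behaviour of %in% concrete, here is a minimal base-R sketch. The contents of the colors data frame and the helper names (wanted, keep, keep_via_match, other_colors) are assumptions made up for illustration, since the question does not show the actual data. It shows that %in% returns one TRUE/FALSE per element of its left-hand side, that the same result can be written with match() as described on the help page, and that negating the result gives the counterpart of SQL's NOT IN.

```r
# Hypothetical sample data; the real 'colors' table is not shown in the question.
colors <- data.frame(
  id    = 1:5,
  color = c("Red", "Yellow", "Blue", "Green", "Black"),
  stringsAsFactors = FALSE
)
wanted <- c("Red", "Blue", "Green")

# %in% yields a logical vector, one element per value on its left-hand side.
keep <- colors$color %in% wanted

# The same test written with match(): values not found get 0, so "> 0" means "found".
keep_via_match <- match(colors$color, wanted, nomatch = 0L) > 0L

# Logical indexing does the same job as subset() in the answer above.
shortlisted_colors <- colors[keep, ]

# Negating the vector is the counterpart of SQL's NOT IN.
other_colors <- colors[!keep, ]
```

Because the result is an ordinary logical vector, it can be combined with other conditions using & and | just like the == comparisons in the question.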
That help page also illustrates the use of "%"-signs to enclose user-defined infix function names, which will give you a leg up on understanding how @hadley has raised that approach to a much higher level in his dplyr package. If you have a solid background in SQL, then looping back to see what dplyr offers should be very satisfying. I understand that dplyr functions are really a front-end to SQL operations in many instances.
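As a rough sketch of both points, the snippet below defines a user-defined infix operator and then shows what I believe is the dplyr spelling of the same filter. The sample data and the operator name %notin% are hypothetical, chosen only for illustration. The dplyr part is guarded so that it runs only if the package is installed.

```r
# Hypothetical sample data; the real 'colors' table is not shown in the question.
colors <- data.frame(
  color = c("Red", "Yellow", "Blue", "Green", "Black"),
  stringsAsFactors = FALSE
)
wanted <- c("Red", "Blue", "Green")

# A user-defined infix operator: the name is wrapped in % signs and must be
# quoted with backticks when it is defined. '%notin%' is a made-up name.
`%notin%` <- function(x, table) !(x %in% table)

# Rows whose color is NOT one of the wanted values.
excluded <- subset(colors, color %notin% wanted)

# The dplyr equivalent of the original subset() call, run only if dplyr is available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  shortlisted <- dplyr::filter(colors, color %in% wanted)
}
```

Once defined, %notin% can be used anywhere %in% can, including inside subset() or dplyr::filter().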