
How does dplyr’s between work?

I’ve read the documentation and I’ve tried googling it; it should be a simple thing, but apparently it’s not to me, so I boldly go forth and ask whether someone here could explain to me how dplyr’s between() works.

# Explanation documentation
between(x, left, right)

x            A numeric vector of values
left, right  Boundary values

I understand a vector is a one-dimensional array, so I suppose c(1:7) is a vector, right? I tried using the example provided in the documentation as a template to search for flights from July to September, but the following just returns an error:

# Example from documentation cont’d
x <- rnorm(1e2)
x[between(x, -1, 1)]

# Loading the library
library(nycflights13)

# Execute my hopeless attempt at between()
flights[between(month, 7, 9)]

# Output and error message
> flights[between(month, 7, 9)]
Error in between(month, 7, 9) : object 'month' not found

I feel really daft asking this, but any help in understanding this will be greatly appreciated. I would also apologise for not asking a well-defined question; as is probably appreciated, I really don’t know how to phrase it other than ‘I don’t get it’.

Canned Man asked Oct 12 '16 11:10


2 Answers

between is nothing special — any other function in R would have led to the same problem. Your confusion stems from the fact that dplyr has a lot of functions that allow you to work on data.frame column names as if they were normal variables; for instance:

filter(flights, month > 9) 

However, between is not one of these functions. As mentioned, it’s simply a normal function. So if you want to use it, you need to provide arguments in the conventional way; for instance:

between(flights$month, 7, 9) 

This will return a logical vector, and you can now use it to index your data.frame:

flights[between(flights$month, 7, 9), ] 
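To see what that logical vector looks like, here is a minimal sketch on a small made-up data frame. The name `toy` and its columns are hypothetical stand-ins for flights, and the sketch assumes dplyr is installed. Both boundaries are included, so `between(x, left, right)` behaves like `x >= left & x <= right`.

```r
library(dplyr)

# Hypothetical toy data standing in for flights
toy <- data.frame(
  month   = c(5, 7, 8, 9, 10),
  carrier = c("AA", "UA", "DL", "AA", "UA")
)

# between() is an ordinary function: pass it the column explicitly
in_range <- between(toy$month, 7, 9)
in_range
# FALSE TRUE TRUE TRUE FALSE

# Same result as writing the comparison by hand (both bounds inclusive)
identical(in_range, toy$month >= 7 & toy$month <= 9)

# Use the logical vector as a row index; the trailing comma selects rows
summer <- toy[in_range, ]
```

Without the trailing comma, the logical vector would be used to select columns rather than rows, which is another reason the attempt in the question could not have worked as written.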

Or, more dplyr-like:

flights %>% filter(between(month, 7, 9)) 

Note that here we now use non-standard evaluation. But the evaluation is performed by filter, not by between. between is called (by filter) using standard evaluation.
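As a rough illustration of that difference, the sketch below runs both forms on a made-up data frame (`toy` is hypothetical, not part of nycflights13) and checks that they pick the same rows. In the first form filter() resolves `month` inside the data frame; in the second, the column is handed to between() explicitly.

```r
library(dplyr)

# Hypothetical toy data standing in for flights
toy <- data.frame(month = c(5, 7, 8, 9, 10))

# filter() looks up `month` inside toy, then calls between() on that vector
via_filter <- toy %>% filter(between(month, 7, 9))

# Base R indexing with the column passed explicitly;
# drop = FALSE keeps the one-column result a data frame
via_index <- toy[between(toy$month, 7, 9), , drop = FALSE]

# Both approaches select the same rows
all.equal(via_filter$month, via_index$month)
```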

Konrad Rudolph answered Oct 05 '22 05:10


I guess you want it like this:

library(nycflights13)
library(dplyr)

flights %>% filter(between(month, 7, 9))

I see in the meantime this solution also appeared in the comments.

Wietze314 answered Oct 05 '22 07:10