 

dplyr: use chaining to pass variables

I'm new to dplyr and cannot figure out how to control which variables are passed through a chained (%>%) command. A simple example: the str_sub function takes three arguments. The first is passed on by %>%, but how do I supply the other two?

library(stringr)
library(dplyr)
df <- data.frame(V1 = c("ABBEDHH", "DEFGH", "EFGF", "EEFD"), 
                 V2=c(4, 2, 1, 1), V3=c(5, 2, 2, 1), stringsAsFactors=FALSE)

In base R I could do:

with(df, str_sub(V1, V2, V3))

and get:

## [1] "ED" "E"  "EF" "E" 

How can I chain this? I tried:

df %>% str_sub(V1, V2, V3) # Error: V3 is an unused argument, because df fills the 1st slot and V1 becomes the 2nd

df %>% select(V1) %>% str_sub(V2, V3) # Error: V2 and V3 are not found, since they only exist as columns of df
user3375672 asked Nov 03 '14 13:11



2 Answers

You can do the following:

library(dplyr)
library(stringr)

df %>% mutate(new = str_sub(V1, V2, V3))
#       V1 V2 V3 new
#1 ABBEDHH  4  5  ED
#2   DEFGH  2  2   E
#3    EFGF  1  2  EF
#4    EEFD  1  1   E

Note that dplyr is designed for working with data.frames, so both the input and the output of its verbs are data.frames, not atomic vectors.
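If a plain vector is what you need in the end, one option is to finish the chain by extracting the column. The following is a minimal sketch, assuming a reasonably recent dplyr that provides pull(); the column name new and the object name res are just illustrative choices:

```r
library(dplyr)
library(stringr)

df <- data.frame(V1 = c("ABBEDHH", "DEFGH", "EFGF", "EEFD"),
                 V2 = c(4, 2, 1, 1), V3 = c(5, 2, 2, 1),
                 stringsAsFactors = FALSE)

# transmute() keeps only the newly computed column (still a data.frame);
# pull() then extracts that column as an atomic vector
res <- df %>%
  transmute(new = str_sub(V1, V2, V3)) %>%
  pull(new)

res
## [1] "ED" "E"  "EF" "E"
```

The chain stays flat this way, and the data.frame only becomes a vector at the very last step.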

talat answered Oct 28 '22 01:10



One could also do:

df %>% with(str_sub(V1, V2, V3))

since you want a vector anyway. But now we're back in nested land.
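A possible way to avoid the nesting is magrittr's exposition pipe, %$%, which makes the columns of the left-hand side available by name to the expression on the right. This is a sketch that assumes the magrittr package is attached explicitly (dplyr re-exports %>% but not %$%); res2 is an illustrative name:

```r
library(magrittr)
library(stringr)

df <- data.frame(V1 = c("ABBEDHH", "DEFGH", "EFGF", "EEFD"),
                 V2 = c(4, 2, 1, 1), V3 = c(5, 2, 2, 1),
                 stringsAsFactors = FALSE)

# %$% exposes V1, V2 and V3 to str_sub(), much like with(df, ...),
# but without wrapping the call in another function
res2 <- df %$% str_sub(V1, V2, V3)

res2
## [1] "ED" "E"  "EF" "E"
```

The result is the same character vector that with(df, str_sub(V1, V2, V3)) returns.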

Tyler Rinker answered Oct 28 '22 00:10
