
dplyr: add rows within group_by groups

Tags: r, dplyr, grouping

Is there a better way to add rows within group_by() groups than using bind_rows()? Here's an example that's a little clunky:

library(dplyr)

df <- data.frame(a=c(1,1,1,2,2), b=1:5)

df %>%
  group_by(a) %>%
  do(bind_rows(data.frame(a=.$a[1], b=0), ., data.frame(a=.$a[1], b=10)))

The idea is that columns that we're already grouping on could be inferred from the groups.

I was wondering whether something like this could work instead:

df %>%
  group_by(a) %>%
  insert(b=0, .at=0) %>%
  insert(b=10)

Like append(), it could default to inserting after all existing elements, and it could be smart enough to fill in the group values for any grouping columns left unspecified. Unspecified non-grouping columns could be set to NA.

Is there an existing convenient syntax I've missed, or would this be helpful?

asked Dec 31 '15 21:12 by Ken Williams


1 Answer

Here's an approach using data.table:

library(data.table)
setDT(df)

rbind(df, expand.grid(b = c(0, 10), a = df[ , unique(a)]))[order(a, b)]

Depending on your actual context, this much simpler alternative would work too:

df[ , .(b = c(0, b, 10)), by = a]

(and we can simply use c(0, b, 10) in j if we don't care about keeping the name b)
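To make the naming difference concrete, here is a small sketch using the question's data (the object names `named` and `unnamed` are just for illustration). When j returns an unnamed vector, data.table assigns the default column name V1:

```r
library(data.table)

# Same data as in the question, built directly as a data.table
df <- data.table(a = c(1, 1, 1, 2, 2), b = 1:5)

# Named result: the padded column keeps the name `b`
named <- df[, .(b = c(0, b, 10)), by = a]

# Unnamed result: same values, but the column gets the default name V1
unnamed <- df[, c(0, b, 10), by = a]

names(named)    # "a" "b"
names(unnamed)  # "a" "V1"
```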

The former has the advantage that it still works when df has more columns: you just have to set fill = TRUE in the rbind call (rbind.data.table), so that the columns missing from the new rows are filled with NA.
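A minimal sketch of that case, assuming a hypothetical extra column `z` that is not in the question. The new rows are converted with as.data.table() here so that both inputs to rbind are data.tables:

```r
library(data.table)

# Question's data plus a hypothetical extra column `z`
df <- data.table(a = c(1, 1, 1, 2, 2), b = 1:5, z = letters[1:5])

# Rows to add: b = 0 and b = 10 for every group; `z` is deliberately absent
new_rows <- as.data.table(expand.grid(b = c(0, 10), a = unique(df$a)))

# fill = TRUE matches columns by name and pads the missing `z` with NA
out <- rbind(df, new_rows, fill = TRUE)[order(a, b)]
out
```

The added rows come back with z set to NA, while the original rows keep their values.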

answered Oct 17 '22 07:10 by MichaelChirico