
Collapsing rows by user with dplyr

Tags: r, dplyr

I want to collapse the rows so that there is one row per user, with a '1' in each column where any of that user's rows has a '1'.

Each row has only a single '1', so nothing needs to be added up when the rows are combined.

My df:

User  +1  +2  +3  +4  +5
   A   1   0   0   0   0
   A   0   1   0   0   0
   A   0   0   0   0   1
   B   0   0   1   0   0 
   B   0   0   0   1   0

Expected result:

User  +1  +2  +3  +4  +5
   A   1   1   0   0   1
   B   0   0   1   1   0 

Any help would be appreciated.

asked Feb 03 '15 19:02 by ant


2 Answers

Looks like you can use summarise_all:

df %>% group_by(User) %>% summarise_all(sum)

Edit note: replaced summarise_each, which is now deprecated, with summarise_all.
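For readers on a recent dplyr, here is a minimal sketch of the same grouped sum written with across(). It assumes dplyr 1.0.0 or later, and it renames the columns p1 to p5, since names starting with + are not syntactic in R. The data frame is rebuilt from the question so the example runs on its own.

```r
library(dplyr)

# Sample data from the question, with columns renamed p1..p5
# (an assumption made here because "+1" is not a syntactic R name).
df <- data.frame(
  User = c("A", "A", "A", "B", "B"),
  p1 = c(1, 0, 0, 0, 0),
  p2 = c(0, 1, 0, 0, 0),
  p3 = c(0, 0, 0, 1, 0),
  p4 = c(0, 0, 0, 0, 1),
  p5 = c(0, 0, 1, 0, 0)
)

# Sum every non-grouping column within each user.
# Inside a grouped summarise(), everything() skips the grouping column.
collapsed <- df %>%
  group_by(User) %>%
  summarise(across(everything(), sum))

print(collapsed)
```

Because each user has at most one '1' per column in this data, sum and max give the same result; max would be the safer choice if duplicate rows could occur and the output must stay 0/1.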

answered Nov 15 '22 20:11 by talat


I would approach this by converting the data to long form first, then doing the aggregation, and converting back to wide form if necessary for display purposes.

So, using tidyr,

df %>% gather(rating, count, -User) %>%
  group_by(User, rating) %>%
  summarise(count = max(count)) %>% 
  spread(rating, count)

The first step, gather, converts the data to long form (the columns are named p1 to p5 here instead of +1 to +5):

> df <- read.table(header=TRUE, text='User  p1  p2  p3  p4  p5
   A   1   0   0   0   0
   A   0   1   0   0   0
   A   0   0   0   0   1
   B   0   0   1   0   0 
   B   0   0   0   1   0
')
> df %>% gather(rating, count, -User)
   User rating count
1     A     p1     1
2     A     p1     0
3     A     p1     0
4     B     p1     0
5     B     p1     0
6     A     p2     0
...

And the remaining steps perform the aggregation, then transform back to wide format.
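The same long-then-wide approach can also be sketched with pivot_longer and pivot_wider, which tidyr now recommends over gather and spread. This is an illustration rather than the answer's original code, and it assumes tidyr 1.0.0 or later and dplyr 1.0.0 or later (for the .groups argument). The intermediate names long and wide are introduced here for illustration only.

```r
library(dplyr)
library(tidyr)

# Same sample data as above, with p1..p5 standing in for +1..+5.
df <- read.table(header = TRUE, text = "User p1 p2 p3 p4 p5
A 1 0 0 0 0
A 0 1 0 0 0
A 0 0 0 0 1
B 0 0 1 0 0
B 0 0 0 1 0
")

# Step 1: long form, one row per (User, rating) observation.
long <- df %>%
  pivot_longer(-User, names_to = "rating", values_to = "count")

# Step 2: keep the maximum count per user and rating,
# then pivot back to one row per user.
wide <- long %>%
  group_by(User, rating) %>%
  summarise(count = max(count), .groups = "drop") %>%
  pivot_wider(names_from = rating, values_from = count)

print(wide)
```

Note that pivot_longer orders the long rows by original row (A p1, A p2, ...) rather than by column as gather does, which does not affect the aggregated result.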

answered Nov 15 '22 21:11 by user295691