I want to collapse the rows based on users while placing the '1' on their corresponding columns.
Each row for a user can have only one '1', so nothing needs to be added up across that user's rows.
My df:
User +1 +2 +3 +4 +5
A 1 0 0 0 0
A 0 1 0 0 0
A 0 0 0 0 1
B 0 0 1 0 0
B 0 0 0 1 0
Expected result:
User +1 +2 +3 +4 +5
A 1 1 0 0 1
B 0 0 1 1 0
Any help would be appreciated.
Looks like you can use summarise_all:
df %>% group_by(User) %>% summarise_all(funs(sum))
Edit note: replaced summarise_each, which is now deprecated, with summarise_all.
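On newer dplyr releases (1.0 or later, as far as I know), the scoped verbs and funs() are themselves superseded or deprecated, so the same collapse could probably be written with across(). A minimal sketch, assuming the columns are renamed p1 to p5 so they are syntactic R names (that renaming is my assumption, not part of the question):

```r
library(dplyr)

# Sample data from the question; columns renamed p1..p5 (assumption)
df <- read.table(header = TRUE, text = "User p1 p2 p3 p4 p5
A 1 0 0 0 0
A 0 1 0 0 0
A 0 0 0 0 1
B 0 0 1 0 0
B 0 0 0 1 0
")

# Collapse to one row per user by summing each indicator column
collapsed <- df %>%
  group_by(User) %>%
  summarise(across(everything(), sum))

print(collapsed)
```

If a user could ever have the same column flagged in more than one row, swapping sum for max would keep the result at 0/1.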
The way that I would approach this would be to convert your data to long form first, then do the aggregation, and convert back out to wide form if necessary for display purposes.
So, using tidyr:
df %>% gather(rating, count, -User) %>%
group_by(User, rating) %>%
summarise(count = max(count)) %>%
spread(rating, count)
The first step, gather, converts the data to long form (using p instead of + in the column names):
> df <- read.table(header=TRUE, text='User p1 p2 p3 p4 p5
A 1 0 0 0 0
A 0 1 0 0 0
A 0 0 0 0 1
B 0 0 1 0 0
B 0 0 0 1 0
')
> df %>% gather(rating, count, -User)
User rating count
1 A p1 1
2 A p1 0
3 A p1 0
4 B p1 0
5 B p1 0
6 A p2 0
...
And the remaining steps perform the aggregation, then transform back to wide format.
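gather and spread still work, but in tidyr 1.0 and later they are superseded by pivot_longer and pivot_wider. Here is a sketch of the same long-then-wide pipeline with the newer verbs, reusing the p1..p5 sample data; I have not checked it against every tidyr version, and the .groups argument assumes dplyr 1.0 or later:

```r
library(dplyr)
library(tidyr)

# Same sample data as above, with p1..p5 standing in for +1..+5
df <- read.table(header = TRUE, text = "User p1 p2 p3 p4 p5
A 1 0 0 0 0
A 0 1 0 0 0
A 0 0 0 0 1
B 0 0 1 0 0
B 0 0 0 1 0
")

result <- df %>%
  # Long form: one row per (User, rating) observation
  pivot_longer(-User, names_to = "rating", values_to = "count") %>%
  # Keep the largest value seen for each user/rating pair
  group_by(User, rating) %>%
  summarise(count = max(count), .groups = "drop") %>%
  # Back to wide form: one column per rating
  pivot_wider(names_from = rating, values_from = count)

print(result)
```

Using max rather than sum means the output stays 0/1 even if a user has the same rating flagged twice.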