
Sum in data.table grouping with condition

Tags: r, data.table

I'm not used to working with data.table and need some help with this type of operation.

My data:

library(data.table)
x = c(rep('a', 3), rep('b', 4), 'c')
y = c(1, 2, 1, 4, 4, 2, 4, 5)
dt = data.frame(x, y)

My operation: I want to group by the x variable and sum the unique values of y within each group. This is what I tried:

setDT(dt)[, sm := sum(y), by = list(x)]

The output is:

   x y sm
1: a 1  4
2: a 2  4
3: a 1  4
4: b 4 14
5: b 4 14
6: b 2 14
7: b 4 14
8: c 5  5

But I want:

   x y sm
1: a 1  3
2: a 2  3
3: a 1  3
4: b 4  6
5: b 4  6
6: b 2  6
7: b 4  6
8: c 5  5

I probably have to use .SD, but I don't know how.

Thanks for your help.

Mostafa790 asked Jun 09 '26 02:06


2 Answers

You could sum only the unique values of y within each group:

library(data.table)
setDT(dt)[, sm := sum(unique(y)), x]
dt

#   x y sm
#1: a 1  3
#2: a 2  3
#3: a 1  3
#4: b 4  6
#5: b 4  6
#6: b 2  6
#7: b 4  6
#8: c 5  5
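
If one row per group is enough, the same expression can be used without `:=`. This is only a sketch on the question's data; the result name `res` is an illustrative choice, not something from the original post.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
dt <- data.table(
  x = c(rep("a", 3), rep("b", 4), "c"),
  y = c(1, 2, 1, 4, 4, 2, 4, 5)
)

# Aggregate instead of adding a column: one row per value of x
res <- dt[, .(sm = sum(unique(y))), by = x]
res
#    x sm
# 1: a  3
# 2: b  6
# 3: c  5
```

With `:=` the per-group sum is recycled onto every row of the group, whereas `.()` in `j` returns a new, smaller table and leaves `dt` unchanged.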
Ronak Shah answered Jun 11 '26 15:06


Another option is to drop the duplicated values of y within each group before summing:

setDT(dt)[, sm := sum(y[!duplicated(y)]), by = x]

   x y sm
1: a 1  3
2: a 2  3
3: a 1  3
4: b 4  6
5: b 4  6
6: b 2  6
7: b 4  6
8: c 5  5
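
As a quick check, the sketch below computes the sum three ways on the question's data: with `unique()`, with `!duplicated()`, and with `.SD`, which the question mentions. The column names `sm_unique`, `sm_dedup` and `sm_sd` are made up for this comparison. The `.SD` variant is shown only to illustrate that it is possible; it does not appear to be needed here, since `y` can be referenced directly in `j`.

```r
library(data.table)

# Same data as in the question
dt <- data.table(
  x = c(rep("a", 3), rep("b", 4), "c"),
  y = c(1, 2, 1, 4, 4, 2, 4, 5)
)

# Sum of the distinct y values per group, via unique()
dt[, sm_unique := sum(unique(y)), by = x]

# Same result: keep only the first occurrence of each y value, then sum
dt[, sm_dedup := sum(y[!duplicated(y)]), by = x]

# Same result through .SD, the per-group subset restricted to column y
dt[, sm_sd := .SD[, sum(unique(y))], by = x, .SDcols = "y"]

# All three columns should agree row by row
identical(dt$sm_unique, dt$sm_dedup) && identical(dt$sm_unique, dt$sm_sd)
```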
tmfmnk answered Jun 11 '26 16:06


