data.table WHERE before BY

Tags: r, data.table

I have the following problem, which probably has a pretty simple solution. Given this table and query:

library(data.table)
actions = data.table(User_id  = c("Carl", "Carl", "Carl", "Lisa", "Moe"),
                     category = c(1, 1, 2, 2, 1),
                     value    = c(10, 20, 30, 40, 50))

   User_id category value
1:    Carl        1    10
2:    Carl        1    20
3:    Carl        2    30
4:    Lisa        2    40
5:     Moe        1    50

actions[category == 1, sum(value), by = User_id]

The problem is that it apparently first filters down to the rows where category is 1 and only then applies by. So what I get is:

   User_id V1
1:    Carl 30
2:     Moe 50

But what I want is:

   User_id V1
1:    Carl 30
2:    Lisa  0
3:     Moe 50

I am building a data.table that contains only per-user information, so I tried:

users = actions[,User_id,by= User_id]
users$value_one = actions[category==1,.(value_one =sum(value)),by= User_id]$value_one

which throws an error or assigns wrong values when some users have no rows in that category.

asked May 18 '16 15:05

Marvins.seins

1 Answer

This is almost as succinct, and gets the job done.

actions[, .SD[category==1, sum(value)], by=User_id]
#    User_id V1
# 1:    Carl 30
# 2:    Lisa  0
# 3:     Moe 50

## Or, better yet, skip .SD entirely (h/t David Arenburg)
actions[, sum(value[category == 1]), by = User_id]
#    User_id V1
# 1:    Carl 30
# 2:    Lisa  0
# 3:     Moe 50
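
To illustrate why the second form keeps Lisa, here is a minimal sketch using the `actions` table from the question. The filter sits inside `j`, so `by` still forms a group for every user, and `sum()` over an empty vector returns 0. The names `filtered_first` and `totals` are only illustrative.

```r
library(data.table)

# Same data as in the question
actions <- data.table(
  User_id  = c("Carl", "Carl", "Carl", "Lisa", "Moe"),
  category = c(1, 1, 2, 2, 1),
  value    = c(10, 20, 30, 40, 50)
)

# Filtering in i drops Lisa's only row before grouping, so her group never exists
filtered_first <- actions[category == 1, .(V1 = sum(value)), by = User_id]

# Filtering inside j keeps every group; Lisa's subset is empty and sum(numeric(0)) is 0
totals <- actions[, .(V1 = sum(value[category == 1])), by = User_id]
```

Here `filtered_first` has two rows (Carl and Moe), while `totals` has one row per user.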

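The approaches above subset within each group, which can be slow when there are many groups. If that is a problem in your use case, here's a more efficient alternative that aggregates the filtered rows once and joins the result back: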

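# Start with one row per user and a default of 0
res <- actions[, .(val = 0), by = User_id]
# Update join: overwrite val for users that have category 1 rows
res[actions[category == 1, .(val = sum(value)), by = User_id], val := i.val, on = "User_id"]
res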
#    User_id val
# 1:    Carl  30
# 2:    Lisa   0
# 3:     Moe  50
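
The same update-join idea can be applied to the `users` table from the question. The following is a sketch under assumptions, not the only way to do it: `users` and `value_one` come from the question, while `cat_one` is a hypothetical helper name introduced here.

```r
library(data.table)

# Same data as in the question
actions <- data.table(
  User_id  = c("Carl", "Carl", "Carl", "Lisa", "Moe"),
  category = c(1, 1, 2, 2, 1),
  value    = c(10, 20, 30, 40, 50)
)

# One row per user, with a default of 0 for users without category 1 rows
users <- unique(actions[, .(User_id)])
users[, value_one := 0]

# Aggregate the category 1 rows once (cat_one is an illustrative name)
cat_one <- actions[category == 1, .(value_one = sum(value)), by = User_id]

# Update join: only users present in cat_one are overwritten, matched by User_id
users[cat_one, value_one := i.value_one, on = "User_id"]
```

Because the join matches on `User_id` rather than on row position, users with no category 1 rows keep the default instead of shifting the other values out of alignment.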

answered Sep 18 '22 13:09

Josh O'Brien
