dplyr and tail to change last value in a group_by in r

Tags: r, dplyr, tail

While using dplyr, I'm having trouble changing the last value in my data frame. I want to group by user and tag, and change the Time to 0 for the last value / row in each group.

     user_id     tag   Time
1  268096674       1    3
2  268096674       1    10
3  268096674       1    1
4  268096674       1    0
5  268096674       1    9999
6  268096674       2    0
7  268096674       2    9
8  268096674       2    500
9  268096674       3    0
10 268096674       3    1
...

Desired output:

     user_id     tag   Time
1  268096674       1    3
2  268096674       1    10
3  268096674       1    1
4  268096674       1    0
5  268096674       1    0
6  268096674       2    0
7  268096674       2    9
8  268096674       2    0
9  268096674       3    0
10 268096674       3    1
...

I've tried something like this, among other things, and can't figure it out:

df %>%
  group_by(user_id,tag) %>%
  mutate(tail(Time) <- 0)

I tried adding a row number as well, but couldn't quite put it all together. Any help would be appreciated.

asked Apr 26 '15 18:04

itjcms18

2 Answers

Here's an option:

df %>%
  group_by(user_id, tag) %>%
  mutate(Time = c(Time[-n()], 0))
#Source: local data frame [10 x 3]
#Groups: user_id, tag
#
#     user_id tag Time
#1  268096674   1    3
#2  268096674   1   10
#3  268096674   1    1
#4  268096674   1    0
#5  268096674   1    0
#6  268096674   2    0
#7  268096674   2    9
#8  268096674   2    0
#9  268096674   3    0
#10 268096674   3    0

What I did here: build a vector from the existing column "Time" containing every element except the last one in the group (which has the index n()), then append a 0 as the new last element using c() for concatenation.

Note that in my output the Time value in row 10 is also changed to 0 because it is considered the last entry of the group.
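For anyone who wants to run this end to end, here is a minimal sketch of the same idea written with row_number() == n() instead of c(). The data frame is rebuilt by hand from the first 10 rows of the question's sample, and the object name res is my own; treat it as one possible variant rather than the only way to do it.

```r
library(dplyr)

# Sample data rebuilt by hand from the question (first 10 rows only)
df <- data.frame(
  user_id = rep(268096674, 10),
  tag     = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3),
  Time    = c(3, 10, 1, 0, 9999, 0, 9, 500, 0, 1)
)

res <- df %>%
  group_by(user_id, tag) %>%
  # n() is the group size, so row_number() == n() is TRUE only on the
  # last row of each group; that row gets 0, all others keep their value
  mutate(Time = if_else(row_number() == n(), 0, Time)) %>%
  ungroup()

res$Time
```

Writing the mutate() step as Time = replace(Time, n(), 0) should give the same result. Both variants rely on the rows already being in the order you care about within each group.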

answered Oct 18 '22 22:10

talat

I would like to offer an alternative approach that avoids copying the whole column (which both Time[-n()] and replace do) and modifies the data in place:

library(data.table)
indx <- setDT(df)[, .I[.N], by = .(user_id, tag)]$V1 # finding the last incidences per group
df[indx, Time := 0L] # modifying in place
df
#       user_id tag Time
#  1: 268096674   1    3
#  2: 268096674   1   10
#  3: 268096674   1    1
#  4: 268096674   1    0
#  5: 268096674   1    0
#  6: 268096674   2    0
#  7: 268096674   2    9
#  8: 268096674   2    0
#  9: 268096674   3    0
# 10: 268096674   3    0
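A self-contained sketch of the same steps, assuming the sample data from the question rebuilt by hand (the object name dt is my own, and Time is stored as integer so that 0L matches the column type):

```r
library(data.table)

# Sample data rebuilt by hand from the question (first 10 rows only)
dt <- data.table(
  user_id = rep(268096674, 10),
  tag     = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L),
  Time    = c(3L, 10L, 1L, 0L, 9999L, 0L, 9L, 500L, 0L, 1L)
)

# .I holds row numbers within the full table and .N is the group size,
# so .I[.N] is the row number of the last row in each group
indx <- dt[, .I[.N], by = .(user_id, tag)]$V1

# := updates only the selected rows by reference; no assignment back to dt
dt[indx, Time := 0L]
```

Because := works by reference, dt itself is changed after the last line, which is the point of this approach. If the original table must be kept, take a copy() first.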

answered Oct 18 '22 21:10

David Arenburg
