I have a data.frame similar to this
id <- c(1,1,1,2,2,3,3,3,3,3)
action <- c("for","l","for","f","l","l","for","for","for","f")
time <- c(45,35,24,56,100,121,30,10,35,143)
dframe <- data.frame(id,action,time)
Only the action "for" repeats itself in consecutive rows within each unique id. I would like to collapse these rows into one row that sums the time for action "for". I want to do this only within each unique id AND when they follow each other (as in id==3, not id==1)
I tried the following code but this does not distinguish between the actions that follow one after another but rather sums all occurrences of "for" within the unique id.
aggregate(action_time ~ id + act, data=mean.event, FUN=sum)
Thanks for your time.
Using rle()
, inverse.rle()
, and the data.table package:
## Reproduce example data, naming it df and setting stringsAsFactors=FALSE
id <- c(1,1,1,2,2,3,3,3,3,3)
action <- c("for","l","for","f","l","l","for","for","for","f")
time <- c(45,35,24,56,100,121,30,10,35,143)
df <- data.frame(id,action,time, stringsAsFactors=FALSE)
## Use rle() and inverse.rle() to give each run of "for"s a distinct name
r <- rle(df$action)
r$values <- paste0(r$values, seq_along(r$values))
(r <- inverse.rle(r))
# [1] "for1" "l2" "for3" "f4" "l5" "l5" "for6" "for6" "for6" "f7"
## Use data.table to subset by run of "for"s *and* by id, collapsing only
## sub-data.tables consisting of consecutive "for"s within an id.
library(data.table)
dt <- data.table(df)
dt[ , if(action[1]=="for") {
X <- .SD[1,]
X$time <- sum(time)
X
} else {.SD},
by=list(r, id)][,-1,with=FALSE]
# id action time
# 1: 1 for 45
# 2: 1 l 35
# 3: 1 for 24
# 4: 2 f 56
# 5: 2 l 100
# 6: 3 l 121
# 7: 3 for 75
# 8: 3 f 143
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With