I have a dataset that looks like the following:
library(data.table)
rownum <- c(1,2,3,4,5,6,7,8,9,10)
name <- c("jeff","jeff","mary","jeff","jeff","jeff","mary","mary","mary","mary")
text <- c("a","b","c","d","e","f","g","h","i","j")
a <- data.table(rownum, name, text)
I would like to add a new text column that cumulatively concatenates the text values within each name, in rownum order. The vector for the new column would be:
rolltext <- c("a","ab","c","abd","abde","abdef","cg","cgh","cghi","cghij")
I am at a loss as to how to do this. For numbers I would just use cumsum(), but for text I think I would need a for loop or one of the apply functions?
Here's an idea using substring().
a[, rolltext := substring(paste(text, collapse = ""), 1, 1:.N), by = name]
which gives
    rownum name text rolltext
 1:      1 jeff    a        a
 2:      2 jeff    b       ab
 3:      3 mary    c        c
 4:      4 jeff    d      abd
 5:      5 jeff    e     abde
 6:      6 jeff    f    abdef
 7:      7 mary    g       cg
 8:      8 mary    h      cgh
 9:      9 mary    i     cghi
10:     10 mary    j    cghij
We might be able to speed this up a bit with the stringi package:
library(stringi)
a[, rolltext := stri_sub(stri_c(text, collapse = ""), length = 1:.N), by = name]
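One caveat: cutting the collapsed string at positions 1:.N relies on every text entry being exactly one character, as in the question's data. If entries can be longer, a possible adjustment (a sketch, not part of the original answer) is to cut at the cumulative character counts instead. The table `b` below is hypothetical data made up for illustration.

```r
library(data.table)

# Hypothetical data with multi-character entries (not from the question)
b <- data.table(
  name = c("jeff", "jeff", "mary", "jeff"),
  text = c("ab", "c", "xyz", "de")
)

# Collapse each group's text into one string, then cut it at the
# cumulative number of characters rather than at 1:.N
b[, rolltext := substring(paste(text, collapse = ""), 1, cumsum(nchar(text))),
  by = name]

print(b$rolltext)
```

With single-character entries, cumsum(nchar(text)) equals 1:.N, so this reduces to the substring() version above.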
You can use Reduce with the accumulate option:
a[, rolltext := Reduce(paste0, text, accumulate = TRUE), by = name]
    rownum name text rolltext
 1:      1 jeff    a        a
 2:      2 jeff    b       ab
 3:      3 mary    c        c
 4:      4 jeff    d      abd
 5:      5 jeff    e     abde
 6:      6 jeff    f    abdef
 7:      7 mary    g       cg
 8:      8 mary    h      cgh
 9:      9 mary    i     cghi
10:     10 mary    j    cghij
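To see what Reduce contributes on its own, here is a minimal base R sketch on a single group's values, outside of data.table. The vector `jeff_text` is just the jeff rows from the question, copied by hand for illustration.

```r
# The "jeff" rows of the text column, in rownum order
jeff_text <- c("a", "b", "d", "e", "f")

# accumulate = TRUE keeps every intermediate result of the left fold:
# "a", paste0("a", "b"), paste0("ab", "d"), ...
running <- Reduce(paste0, jeff_text, accumulate = TRUE)

print(running)
```

Because each intermediate result has length one, Reduce simplifies the output to a character vector, which is why it can be assigned directly as a column within each by group.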
Alternatively, as @DavidArenburg suggested, construct each row using sapply:
a[, rolltext := sapply(1:.N, function(x) paste(text[1:x], collapse = '')), by = name]
Note that this is a running (cumulative) operation, whereas a rolling operation (as in the OP's title) works over a fixed-width window, which is something different, at least in R lingo.
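The distinction can be illustrated with a small base R sketch. The window width `k` and the choice to allow shorter windows at the start are assumptions made for this example, not something from the thread.

```r
x <- c("a", "b", "d", "e", "f")
k <- 2  # hypothetical window width

# Running: each element concatenates everything seen so far
running <- Reduce(paste0, x, accumulate = TRUE)

# Rolling: each element concatenates only the last k values
# (the first k - 1 windows are shorter because there is less history)
rolling <- sapply(seq_along(x), function(i) {
  paste(x[max(1, i - k + 1):i], collapse = "")
})

print(running)
print(rolling)
```

The running result keeps growing, while each rolling result never exceeds k values.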