
Equivalent to cumsum for string in R [duplicate]

Tags:

r

I am looking for a way to do what would be the equivalent of a cumulative sum in R for string/character-formatted text instead of numbers. The different text fields should be concatenated.

E.g. in the data frame "df":

Column A contains the input, column B the desired result.

  A        B
1 banana   banana 
2 boats    banana boats
3 are      banana boats are
4 awesome  banana boats are awesome

Currently I am solving this with the following loop:

df$B <- ""

for(i in 1:nrow(df)) {
    if (length(df[i-1,"A"]) > 0) {
        df$B[i] <- paste(df$B[i-1],df$A[i])
    } else {
        df$B[i] <- df$A[i]
    }
}

I wonder whether there exists a more elegant/faster solution.

Phil asked Feb 12 '16 12:02


3 Answers

(df$B <- Reduce(paste, as.character(df$A), accumulate = TRUE))
# [1] "banana"     "banana boats"      "banana boats are"    "banana boats are awesome"
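
To try this end to end, here is a minimal self-contained sketch. The construction of `df` is an assumption (the question does not show how it was built), and `B_dash` is a made-up name used only to illustrate a different separator.

```r
# Assumed reconstruction of the question's data frame (not shown in the original post).
df <- data.frame(A = c("banana", "boats", "are", "awesome"),
                 stringsAsFactors = FALSE)

# Reduce() applies paste() pairwise from left to right;
# accumulate = TRUE keeps every intermediate result instead of only the last one.
df$B <- Reduce(paste, as.character(df$A), accumulate = TRUE)

# A different separator can be supplied through an anonymous function.
B_dash <- Reduce(function(x, y) paste(x, y, sep = "-"), df$A, accumulate = TRUE)
```

The `as.character()` call matters when column A is a factor, which was the default for `data.frame()` before R 4.0.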
Julius Vainora answered Oct 11 '22 02:10


We can try

 i1 <- sequence(seq_len(nrow(df1)))
 tapply(df1$A[i1], cumsum(c(TRUE,diff(i1) <=0)),
                     FUN= paste, collapse=' ')

Or

 i1 <- rep(seq(nrow(df1)), seq(nrow(df1)))
 tapply(i1, i1, FUN= function(x) 
          paste(df1$A[seq_along(x)], collapse=' ') )
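
Both variants work by expanding the column into one block per row and then pasting within each block. A sketch of the intermediate values for the first variant, assuming `df1` holds the same four rows as `df` in the question (the names `grp` and `res` are introduced here for illustration):

```r
# Assumed data, mirroring the question's example.
df1 <- data.frame(A = c("banana", "boats", "are", "awesome"),
                  stringsAsFactors = FALSE)

# sequence() concatenates 1:1, 1:2, 1:3, 1:4.
i1 <- sequence(seq_len(nrow(df1)))
# i1 is 1 1 2 1 2 3 1 2 3 4

# A new group starts wherever the index stops increasing.
grp <- cumsum(c(TRUE, diff(i1) <= 0))
# grp is 1 2 2 3 3 3 4 4 4 4

# Pasting within each group yields the cumulative strings.
res <- tapply(df1$A[i1], grp, FUN = paste, collapse = " ")
as.vector(res)
```

Note that the expanded index has n(n+1)/2 elements for n rows, so memory use grows quadratically with the number of rows.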
akrun answered Oct 11 '22 00:10


I don't know if it's faster, but at least the code is shorter:

sapply(seq_along(df$A), function(x) paste(df$A[1:x], collapse = " "))

Thanks to Roland's comment, I realised that this is one of the rare occurrences where a for loop can be useful, as it saves us the repeated indexing: each step only extends the previous result instead of re-pasting all earlier elements. It differs from the OP's loop in that it starts at 2, which removes the need for the if statement inside the for loop.

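# Preallocate the result vector.
res <- character(length(df1$A))
res[1] <- as.character(df1$A[1])
# seq_along(...)[-1] is empty for a single row, unlike 2:length(...).
for (i in seq_along(df1$A)[-1]) {
  res[i] <- paste(res[i - 1], df1$A[i])
}
res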
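
Whether any of these is actually faster depends on the input size, so it is worth measuring. Below is a rough sketch using base R's `system.time()`. The test vector `x` and the helper names `cum_sapply` and `cum_loop` are made up for illustration, and no particular timings are claimed.

```r
# Hypothetical test input: 2000 short words.
x <- rep(c("banana", "boats", "are", "awesome"), 500)

# sapply version: re-pastes the first i elements for every i.
cum_sapply <- function(v) {
  sapply(seq_along(v), function(i) paste(v[1:i], collapse = " "))
}

# Loop version: extends the previous result by one element.
cum_loop <- function(v) {
  res <- character(length(v))
  if (length(v) == 0) return(res)
  res[1] <- v[1]
  for (i in seq_along(v)[-1]) {
    res[i] <- paste(res[i - 1], v[i])
  }
  res
}

# Sanity check: all three approaches should agree.
stopifnot(identical(cum_sapply(x), cum_loop(x)))
stopifnot(identical(cum_loop(x), Reduce(paste, x, accumulate = TRUE)))

# Timings vary by machine and R version.
system.time(cum_sapply(x))
system.time(cum_loop(x))
system.time(Reduce(paste, x, accumulate = TRUE))
```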
Heroka answered Oct 11 '22 00:10