
Tracking full Level change row by row

This is what my data looks like. The rightmost column ("FullCycle") is the column I want to create. For a given name and at a given point in time, I want to see the whole cycle of level changes that person has gone through so far.

 library(data.table)
     dt <- fread('
        Name      Level     Date         RecentLevelChange  FullCycle
        John       1       2016-01-01       NA                1
        John       1       2016-01-10       NA                1
        John       2       2016-01-17       1->2              1->2
        John       2       2016-01-18       NA                1->2
        John       3       2016-01-19       2->3              1->2->3
        John       4       2016-01-20       3->4              1->2->3->4
        John       4       2016-01-21       NA                1->2->3->4
        John       7       2016-01-22       4->7              1->2->3->4->7
        Tom        1       2016-01-10       NA                1
        Tom        2       2016-01-17       1->2              1->2
        Tom        2       2016-01-18       NA                1->2
        Tom        3       2016-01-19       2->3              1->2->3
        Tom        4       2016-01-20       3->4              1->2->3->4
        Tom        4       2016-01-21       NA                1->2->3->4
        Tom        7       2016-01-22       4->7              1->2->3->4->7
  ')

I created the "RecentLevelChange" field like this:

library(dplyr)  # lag() below is dplyr::lag, which shifts values down by one row
dt[, RecentLevelChange :=
     as.character(ifelse(lag(Level) == Level, NA,
                         paste(lag(Level), Level, sep = "->"))),
   by = Name]

But I don't know how to create the "FullCycle" column. I sincerely appreciate your help.

gibbz00 asked Jan 07 '23 04:01



1 Answer

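Here's a helper function to calculate the paths:

paths <- function(x) {
    # Accumulate the unique levels seen so far, one vector per row
    steps <- Reduce(function(prev, cur) unique(c(prev, cur)), x, accumulate = TRUE)
    # Collapse each vector into a single "a->b->c" string
    sapply(steps, function(s) paste(s, collapse = "->"))
}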
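To see what this helper builds internally, here is a minimal sketch for a single person's levels. The vector `lvl` and the names `steps` and `full_cycle` are illustrative only; the values are copied from John's rows in the question.

```r
# Levels for one person, already in date order (John's rows from the question).
lvl <- c(1, 1, 2, 2, 3, 4, 4, 7)

# accumulate = TRUE keeps every intermediate result, so element i holds
# the unique levels seen in rows 1..i.
steps <- Reduce(function(prev, cur) unique(c(prev, cur)), lvl, accumulate = TRUE)

steps[[3]]  # unique levels seen after the third row
steps[[8]]  # unique levels seen after the last row

# Collapse each intermediate vector into one "a->b->c" string per row.
full_cycle <- vapply(steps, paste, character(1), collapse = "->")
full_cycle
```

Each row gets its own string because the accumulated list has one element per input value.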

This uses Reduce() with accumulation to build the vector of unique levels seen up to each row. (It assumes the rows are already sorted by date within each name.) We can then apply the function to each person:

dt[,path:=paths(Level), by="Name"]

This produces

    Name Level       Date RecentLevelChange          path
 1: John     1 2016-01-01                NA             1
 2: John     1 2016-01-10                NA             1
 3: John     2 2016-01-17              1->2          1->2
 4: John     2 2016-01-18                NA          1->2
 5: John     3 2016-01-19              2->3       1->2->3
 6: John     4 2016-01-20              3->4    1->2->3->4
 7: John     4 2016-01-21                NA    1->2->3->4
 8: John     7 2016-01-22              4->7 1->2->3->4->7
 9:  Tom     1 2016-01-10                NA             1
10:  Tom     2 2016-01-17              1->2          1->2
11:  Tom     2 2016-01-18                NA          1->2
12:  Tom     3 2016-01-19              2->3       1->2->3
13:  Tom     4 2016-01-20              3->4    1->2->3->4
14:  Tom     4 2016-01-21                NA    1->2->3->4
15:  Tom     7 2016-01-22              4->7 1->2->3->4->7
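The result depends on row order, so if the data might not be sorted, one option is to sort it explicitly first. The following is a sketch under that assumption, using a small shuffled sample that mirrors a few of Tom's rows; `setorder()` is data.table's in-place sort, and the `vapply()` call is a stand-in for the `sapply()` in the helper.

```r
library(data.table)

# Small shuffled sample; values mirror a few of Tom's rows from the question.
dt <- data.table(
  Name  = c("Tom", "Tom", "Tom", "Tom"),
  Level = c(3, 1, 2, 2),
  Date  = as.Date(c("2016-01-19", "2016-01-10", "2016-01-18", "2016-01-17"))
)

paths <- function(x) {
  steps <- Reduce(function(prev, cur) unique(c(prev, cur)), x, accumulate = TRUE)
  vapply(steps, paste, character(1), collapse = "->")
}

# Sort in place by person, then by date, before accumulating the path.
setorder(dt, Name, Date)
dt[, path := paths(Level), by = Name]
dt$path
```

Without the `setorder()` call, the first row of this sample would start the path at level 3 instead of level 1.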

If you want to track whether users go back to levels they were at previously, you can use something like this instead:

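paths <- function(x) {
    # rle() collapses only consecutive repeats, so a return to an earlier level is kept
    steps <- Reduce(function(prev, cur) rle(c(prev, cur))$values, x, accumulate = TRUE)
    sapply(steps, function(s) paste(s, collapse = "->"))
}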

For example:

paths(c(1,2,3,2,1))
# [1] "1"             "1->2"          "1->2->3"       "1->2->3->2"   
# [5] "1->2->3->2->1"
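To make the difference between the two variants concrete, here is a small side-by-side sketch. The names `paths_unique` and `paths_rle` are hypothetical, introduced only so both versions can coexist in one session; the input vector is made up for illustration.

```r
# Variant 1: each level appears at most once in the path.
paths_unique <- function(x) {
  steps <- Reduce(function(prev, cur) unique(c(prev, cur)), x, accumulate = TRUE)
  vapply(steps, paste, character(1), collapse = "->")
}

# Variant 2: only consecutive repeats are collapsed, so revisits are recorded.
paths_rle <- function(x) {
  steps <- Reduce(function(prev, cur) rle(c(prev, cur))$values, x, accumulate = TRUE)
  vapply(steps, paste, character(1), collapse = "->")
}

# A person who climbs to level 3 and then drops back down.
lvl <- c(1, 2, 2, 3, 2, 1)

res_unique <- paths_unique(lvl)
res_rle    <- paths_rle(lvl)

res_unique[6]  # "1->2->3"
res_rle[6]     # "1->2->3->2->1"
```

The two variants agree as long as levels never repeat after a change, which is the case for the sample data in the question.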
MrFlick answered Jan 15 '23 22:01
