Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

change data in column with the previous information in another column

Tags:

r

I have a dataset with three columns. The column user has two actions including action1 and action2. action2 only contains information if action1 column has A data. I want to concatenate P data in action1 with the previous data in action2. For example, if action2 has ac, and the next row has P in action1, I want to change P into Pac, and will continue (all P turn into Pac) until action2 change. Take note that this process should repeat for each user.

df<-read.table(text="
user   action1    action2 
1        A          a
1        B          NA
1        P          NA
1        P          NA
1        A          ac
1        P          NA
2        B          NA
2        P          NA
2        A          aa
2        P          NA
2        AB         aa",header=T)

result: (I highlighted those rows that infected)
user   action1    action2 
1        A          a
1        B          NA
1        Pa         NA <-
1        Pa         NA <-
1        A          ac
1        Pac        NA <-
2        B          NA
2        P          NA
2        A          aa
2        Paa        NA <-
2        AB         NA

Thank you

like image 813
Cina Avatar asked Oct 16 '22 04:10

Cina


2 Answers

library('data.table')
library('zoo')
# Using zoo::na.locf(), fill NA with the previous value and group by user. Also `na.locf` will not remove NA.
setDT(df)[, V3 := na.locf(action2, na.rm = FALSE), by = .(user)]     
# combine action1 with V3 column if action1 is equal to 'P' and it is not NA.
df[action1 == 'P' & !(is.na(V3)), action1 := paste0(action1, V3)] 
df[, V3 := NULL] # remove V3 column
df
#    user action1 action2
# 1:    1       A       a
# 2:    1       B      NA
# 3:    1      Pa      NA
# 4:    1      Pa      NA
# 5:    1       A      ac
# 6:    1     Pac      NA
# 7:    2       B      NA
# 8:    2       P      NA
# 9:    2       A      aa
# 10:    2     Paa      NA
# 11:    2      AB      aa

Data:

df<-read.table(text="
user   action1    action2 
               1        A          a
               1        B          NA
               1        P          NA
               1        P          NA
               1        A          ac
               1        P          NA
               2        B          NA
               2        P          NA
               2        A          aa
               2        P          NA
               2        AB         aa",header=T, stringsAsFactors = FALSE)
like image 54
Sathish Avatar answered Nov 15 '22 12:11

Sathish


Here's an option with the tidyverse

library(dplyr)
library(tidyr)

# for each user carry valid action2 results forward 
df1 <- df %>% 
  group_by(user) %>% 
  tidyr::fill(action2) 

# replace missing values so they can be pasted
df1$action2 <- ifelse(is.na(df1$action2), "", df1$action2)

# add valid action2 results to action1 where action1 is "P"
df1$new <- ifelse(df1$action1 == "P",
                  paste0("P", df1$action2),
                  df1$action1)

# A tibble: 11 x 4
# Groups:   user [2]
    user action1 action2 new  
   <int> <chr>   <chr>   <chr>
 1     1 A       "a"     A    
 2     1 B       "a"     B    
 3     1 P       "a"     Pa   
 4     1 P       "a"     Pa   
 5     1 A       "ac"    A    
 6     1 P       "ac"    Pac  
 7     2 B       ""      B    
 8     2 P       ""      P    
 9     2 A       "aa"    A    
10     2 P       "aa"    Paa  
11     2 AB      "aa"    AB  

stringAsFactors = FALSE was added to the read in data:

df<-read.table(text="
user   action1    action2 
1        A          a
1        B          NA
1        P          NA
1        P          NA
1        A          ac
1        P          NA
2        B          NA
2        P          NA
2        A          aa
2        P          NA
2        AB         aa",header=T, stringsAsFactors = FALSE)
like image 36
Jay Achar Avatar answered Nov 15 '22 13:11

Jay Achar