
R dplyr: Find a specific value in a column, then replace the adjacent cell in the subsequent columns to the right with that value


I am trying to create a matrix of site by time-of-event. In my case, once the event has occurred ("1") it is permanent and cannot go back to "0". Once a cell in a row is "1", I want every cell to its right in the subsequent columns to be "1" as well (see the example below).

library(reshape) # for cast()

site <- c('A','B','C','D','E','F','G') # site
time <- c(0,1,4,0,3,2,0) # time at which the event occurred
event <- c(0,1,1,0,1,1,0) # did an event occur (1 = yes, 0 = no)
data <- data.frame(site, time, event)

site.time.matrix <- cast(data, site~time)

# This is the output      # This is the desired output
#site   0  1  2  3  4     #site   0  1  2  3  4
#    A  0 NA NA NA NA     #    A  0  0  0  0  0
#    B NA  1 NA NA NA     #    B  0  1  1  1  1
#    C NA NA NA NA  1     #    C  0  0  0  0  1
#    D  0 NA NA NA NA     #    D  0  0  0  0  0
#    E NA NA NA  1 NA     #    E  0  0  0  1  1
#    F NA NA  1 NA NA     #    F  0  0  1  1  1
#    G  0 NA NA NA NA     #    G  0  0  0  0  0

I have found some promising code using dplyr (e.g. "Replacing more than one elements with replace function" and "Apply function to each column in a data frame observing each columns existing data type") that replaces values, but I am unsure how to target the adjacent cells in the subsequent columns.

My apologies if this question is unclear; this is my first post on StackOverflow.

Thank you.

Asked by CarlaBirdy, Oct 19 '16 10:10



2 Answers

It was a welcome surprise for a first post to be detailed, reproducible and interesting, +1!

With na.locf from the zoo package you could do:

library(reshape) # for the cast function
library(zoo)     # for na.locf ("last observation carried forward": each NA takes the last non-NA value), see ?na.locf

site <- c('A','B','C','D','E','F','G') # site
time <- c(0,1,4,0,3,2,0) # time at which the event occurred
event <- c(0,1,1,0,1,1,0) # did an event occur (1 = yes, 0 = no)
data <- data.frame(site, time, event)

site.time.matrix <- reshape::cast(data, site~time)

site.time.matrix.fill <- site.time.matrix

# Transpose the matrix excluding first column, carry forward last observation and 
# transpose again to return to original matrix structure

site.time.matrix.fill[,-1] <- t(na.locf(t(site.time.matrix.fill[,-1])))

# Cells that are still NA have no earlier observation in their row, so set them to 0
site.time.matrix.fill[is.na(site.time.matrix.fill)] <- 0


#  site 0 1 2 3 4
#1    A 0 0 0 0 0
#2    B 0 1 1 1 1
#3    C 0 0 0 0 1
#4    D 0 0 0 0 0
#5    E 0 0 0 1 1
#6    F 0 0 1 1 1
#7    G 0 0 0 0 0
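
If you would rather not depend on zoo, the same carry-forward can be sketched in base R with a running maximum. This is only a sketch under an assumption: the values are 0/1 and never go back down, so filling the NAs with 0 first and then taking cummax along each row gives the same result. The small matrix m below is a hypothetical stand-in for the numeric part of the cast output, not the object from the question.

```r
# Hypothetical stand-in for the cast() result: sites as rows, times as columns
m <- rbind(
  A = c(0, NA, NA, NA, NA),
  B = c(NA, 1, NA, NA, NA),
  C = c(NA, NA, NA, NA, 1)
)
colnames(m) <- as.character(0:4)

m[is.na(m)] <- 0                   # treat "nothing recorded" as 0
filled <- t(apply(m, 1, cummax))   # running maximum along each row
dimnames(filled) <- dimnames(m)    # restore site and time labels
filled
```

Because a 1 is the largest value a row can contain, the running maximum stays at 1 from the event column onwards, which is exactly the "permanent" behaviour asked for.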
Answered by Silence Dogood, Sep 24 '22 01:09

A base R approach using apply.

Basically, for every row we look for the element that equals 1, then assign 0 to every element to its left and 1 to that element and every element to its right. Rows with no 1 become all 0.

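t(apply(site.time.matrix, 1, function(x) {
       # index just before the first 1, or the last index if the row has no 1
       temp <- if (any(x == 1, na.rm = TRUE)) which(x == 1) - 1 else length(x)
       x[temp:length(x)] <- 1
       x[0:temp] <- 0   # overwrite everything left of the event with 0
       x
}))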

#  0 1 2 3 4
#A 0 0 0 0 0
#B 0 1 1 1 1
#C 0 0 0 0 1
#D 0 0 0 0 0
#E 0 0 0 1 1
#F 0 0 1 1 1
#G 0 0 0 0 0
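
As a possible shortcut, the filled matrix can also be built straight from the original vectors without casting first. This is a minimal sketch, assuming one row per site as in the question: a cell is 1 when the site had an event and the column time is at or after the event time. The names times and filled are introduced here only for illustration.

```r
site  <- c("A", "B", "C", "D", "E", "F", "G")
time  <- c(0, 1, 4, 0, 3, 2, 0)   # time at which the event occurred
event <- c(0, 1, 1, 0, 1, 1, 0)   # 1 = event occurred, 0 = no event

times <- sort(unique(time))       # column headers: 0 1 2 3 4

# outer() compares every site's event time with every column time;
# multiplying by event (recycled down the rows) zeroes out sites with no event
filled <- outer(time, times, "<=") * event
dimnames(filled) <- list(site, as.character(times))
filled
```

This skips the NA-filling step entirely, at the cost of assuming that the time value of a site with event == 0 carries no information.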
Answered by Ronak Shah, Sep 25 '22 01:09