In R: How to replace NA in a Vector found between two integers

Tags: r, vector

I have the following vector:

A <- c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)

I would like to replace all the NAs between a 1 and the following 4 with 2 (but not the NAs between a 4 and the next 1).

Are there any approaches you would recommend/use for this task?

It may also be managed as a dataframe:

 A 
----
 NA 
 NA 
 NA 
 NA 
 1 
 NA 
 NA 
 4 
 NA 
 NA 
 1 
 NA 
 NA 
 NA 
 NA 
 NA
 4 
 NA 
 1
 NA 
 4
----

Edit: 1. I changed the string "Na" to NA.

SOLUTION/UPDATE Thank you to everyone for your insights. I learnt from them to come up with the following solution to my case. I hope it is useful to someone else:

A <- c(df$A) # copy of the column (not used below)

index.1 <- which(df$A %in% c(1))     # positions of the 1s in A
index.14 <- which(df$A %in% c(1, 4)) # positions of the 1s and 4s in A

loc.1 <- which(index.14 %in% index.1) # where the 1s sit within index.14
loc.4 <- loc.1 + 1 # the entry after each 1 in index.14 (assumed to be a 4)

start.i <- index.14[loc.1] + 1 # first index to replace with 2
end.i <- index.14[loc.4] - 1   # last index to replace with 2

fill.v <- sort(c(start.i, end.i)) # start and end indexes, interleaved by sorting

# create a matrix with one row per (start, end) pair
fill.m <- matrix(fill.v, nrow = length(fill.v) / 2, ncol = 2, byrow = TRUE)

# create a list with the indexes to replace
list.1 <- apply(fill.m, MARGIN = 1, FUN = function(x) seq(x[1], x[2]))

# unlist the list to use it as the index vector for the replacement
list.2 <- unlist(list.1)

df$A[list.2] <- 2 # replace the indexed locations with 2
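
The index-based steps above can probably be condensed. The following is only a minimal sketch, assuming df is a data frame holding the column A from the question and that the 1s and 4s strictly alternate, starting with a 1. The names pos.1, pos.4 and idx are made up for this illustration.

```r
# Assumed input: the vector from the question, stored as a data frame column
df <- data.frame(A = c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1,
                       NA, NA, NA, NA, NA, 4, NA, 1, NA, 4))

pos.1 <- which(df$A %in% 1) # positions of the 1s
pos.4 <- which(df$A %in% 4) # positions of the 4s

# Pair each 1 with the 4 that follows it and build the index range in between.
# The s <= e check skips pairs where the 4 directly follows the 1.
idx <- unlist(Map(function(s, e) if (s <= e) s:e, pos.1 + 1, pos.4 - 1))

df$A[idx] <- 2
df$A
```

If the 1s and 4s do not alternate cleanly (for example a trailing 1 without a 4), pos.1 and pos.4 no longer line up and this sketch would need extra checks.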


asked Mar 04 '19 15:03

Anthony O'Brien


2 Answers

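Assuming A is as shown reproducibly in the Note at the end, the difference of the two cumsums is 1 (treated as TRUE) from each 1 up to, but not including, the following 4, and 0 elsewhere. The second condition keeps only the "Na" elements, which drops the 1 itself. Finally, we replace the positions that are still TRUE with 2. Because A is a character vector, the replacement is stored as the string "2".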

replace(A, (cumsum(A == 1) - cumsum(A == 4)) & (A == "Na"), 2)

giving:

 [1] "Na" "Na" "Na" "Na" "1"  "2"  "2"  "4"  "Na" "Na" "1"  "2"  "2"  "2"  "2" 
[16] "2"  "4"  "Na" "1"  "2"  "4"
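
To see why the one-liner works, it may help to look at the intermediate values. This is only an illustrative breakdown of the expression above; the names ones, fours, open and inside are not part of the original answer.

```r
# Same character data as in the Note at the end of this answer
A <- c("Na", "Na", "Na", "Na", "1", "Na", "Na", "4", "Na", "Na",
       "1", "Na", "Na", "Na", "Na", "Na", "4", "Na", "1", "Na", "4")

ones  <- cumsum(A == 1) # how many 1s have been seen up to each position
fours <- cumsum(A == 4) # how many 4s have been seen up to each position

open <- ones - fours    # 1 from each 1 up to (not including) the next 4
open
# [1] 0 0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 0 0 1 1 0

inside <- open > 0 & A == "Na" # drop the 1s themselves, keep only the "Na"s
which(inside)
# [1]  6  7 12 13 14 15 16 20
```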

NA values

R is case sensitive and Na is not the same as NA. The sample data in the question showed Na values and not NA values but if what was actually meant was a numeric vector with NA values as in AA in the Note below then modify the expression to be as shown here:

replace(AA, cumsum(!is.na(AA) & AA == 1) - cumsum(!is.na(AA) & AA == 4) & is.na(AA), 2)

giving:

[1] NA NA NA NA  1  2  2  4 NA NA  1  2  2  2  2  2  4 NA  1  2  4
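
If this is needed in more than one place, the expression could be wrapped in a small function. The sketch below is one possible way to do that; fill_between and its arguments are hypothetical names, not an existing R function, and it assumes the start and end markers alternate as in the sample data.

```r
# fill_between() is a hypothetical helper, not part of base R
fill_between <- function(x, from = 1, to = 4, value = 2) {
  is_from <- !is.na(x) & x == from # TRUE at each start marker
  is_to   <- !is.na(x) & x == to   # TRUE at each end marker
  # positive from a start marker up to (not including) the next end marker
  inside <- (cumsum(is_from) - cumsum(is_to)) > 0
  replace(x, inside & is.na(x), value) # only NA positions are changed
}

AA <- c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)
fill_between(AA)
# [1] NA NA NA NA  1  2  2  4 NA NA  1  2  2  2  2  2  4 NA  1  2  4
```

Note that a trailing 1 with no following 4 would leave the count positive, so any NAs after it would also be filled. Whether that is desirable depends on the data.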

Note

A <- c("Na", "Na", "Na", "Na", "1", "Na", "Na", "4", "Na", "Na", 
"1", "Na", "Na", "Na", "Na", "Na", "4", "Na", "1", "Na", "4")

AA <- as.numeric(replace(A, A == "Na", NA))

answered Oct 01 '22 00:10

G. Grothendieck

I'm sure there's a better solution to this problem but this should do the trick:

A <-
  c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)

replace <- FALSE # TRUE while we are between a 1 and the next 4

for (i in seq_along(A)) {
  if (!is.na(A[i])) {
    if (A[i] == 1) {
      start <- i + 1 # first position after the 1
      replace <- TRUE
    }
    if (A[i] == 4 && replace) {
      # fill the gap, unless the 4 directly follows the 1
      if (start < i) A[start:(i - 1)] <- 2
      replace <- FALSE
    }
  }
}

EDIT: if you only want to replace the NAs when there is nothing else (for example a 3) between the 1 and the 4, you could use this:

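A <-
  c(NA, NA, NA, NA, 1, NA, 3, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)

replace <- FALSE # TRUE while we are between a 1 and the next 4

for (i in seq_along(A)) {
  if (!is.na(A[i])) {
    if (A[i] == 1) {
      start <- i + 1 # first position after the 1
      replace <- TRUE
    }
    if (A[i] == 4 && replace) {
      # fill the gap, unless the 4 directly follows the 1
      if (start < i) A[start:(i - 1)] <- 2
      replace <- FALSE
    }
    if (A[i] != 4 && A[i] != 1) {
      replace <- FALSE # any other value cancels the pending replacement
    }
  }
}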

Output:

> A
 [1] NA NA NA NA  1 NA  3  4 NA NA  1  2  2  2  2  2  4 NA  1  2  4
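
For longer vectors, the same rule (fill only when a 1 is directly followed by a 4 among the non-NA values) might also be expressed without an explicit loop. This is a sketch under that reading of the rule, not a drop-in replacement tested beyond the sample data; pos, vals, k and idx are names chosen for this example.

```r
A <- c(NA, NA, NA, NA, 1, NA, 3, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)

pos  <- which(!is.na(A)) # positions of all non-NA values
vals <- A[pos]           # the non-NA values, in order

# k marks each non-NA value equal to 1 whose next non-NA value is a 4
k <- which(head(vals, -1) == 1 & tail(vals, -1) == 4)

# index ranges strictly between each such 1 and its 4 (skipped if they are adjacent)
idx <- unlist(Map(function(s, e) if (s <= e) s:e, pos[k] + 1, pos[k + 1] - 1))

A[idx] <- 2
A
#  [1] NA NA NA NA  1 NA  3  4 NA NA  1  2  2  2  2  2  4 NA  1  2  4
```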

And if you only want to replace NAs but keep other values between 1 and 4 use this:

A <-
  c(NA, NA, NA, NA, 1, NA, 3, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)

replace <- FALSE # TRUE while we are between a 1 and the next 4

for (i in seq_along(A)) {
  if (!is.na(A[i])) {
    if (A[i] == 1) {
      start <- i + 1 # first position after the 1
      replace <- TRUE
    }
    if (A[i] == 4 && replace) {
      if (start < i) { # skip if the 4 directly follows the 1
        sub <- A[start:(i - 1)]
        sub[is.na(sub)] <- 2 # only the NAs are replaced
        A[start:(i - 1)] <- sub
      }
      replace <- FALSE
    }
  }
}

Output:

> A
 [1] NA NA NA NA  1  2  3  4 NA NA  1  2  2  2  2  2  4 NA  1  2  4

answered Oct 01 '22 01:10

brettljausn
