I have the following vector:
A:(NA NA NA NA 1 NA NA 4 NA NA 1 NA NA NA NA NA 4 NA 1 NA 4)
I would like to replace all the NAs between a 1 and the following 4 with 2 (but not the NAs between a 4 and the next 1).
Are there any approaches you would recommend/use for this task?
It may also be managed as a dataframe:
A
----
NA
NA
NA
NA
1
NA
NA
4
NA
NA
1
NA
NA
NA
NA
NA
4
NA
1
NA
4
----
Edit: 1. I changed the string "Na" to NA.
SOLUTION/UPDATE Thank you to everyone for your insights. I learnt from them to come up with the following solution to my case. I hope it is useful to someone else:
# example data from the question
df <- data.frame(A = c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4))
# note: assumes every 1 is followed by a 4 before the next 1 appears
index.1 <- which(df$A %in% c(1)) # locations of the 1s in A
index.14 <- which(df$A %in% c(1, 4)) # locations of the 1s and 4s in A
loc.1 <- which(index.14 %in% index.1) # positions of the 1s within index.14
loc.4 <- loc.1 + 1 # positions of the 4s that follow those 1s in index.14
start.i <- index.14[loc.1] + 1 # first index to replace with 2 in each run
end.i <- index.14[loc.4] - 1 # last index to replace with 2 in each run
fill.v <- sort(c(start.i, end.i)) # start and end indexes of the runs, in order
# create a matrix with one row per run: start index, end index
fill.m <- matrix(fill.v, nrow = length(fill.v) / 2, ncol = 2, byrow = TRUE)
# create a list with the indexes to replace in each run
list.1 <- apply(fill.m, MARGIN = 1, FUN = function(x) seq(x[1], x[2]))
# unlist the list to use it as the indexes for replacement
list.2 <- unlist(list.1)
df$A[list.2] <- 2 # replace the indexed locations with 2
Insert Zeros for NA Values in an R Vector (or Column)
R can replace NA with 0 in multiple columns with a single line of code. Here, however, we only need to replace the NAs in a single vector or a single column of a data frame. Let's find out how this works. First, create some example data with missing values.
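library("dplyr")
df <- tibble(x = c(11, 21, NA), y = c("x", NA, "y"))
print(df)
cat("After replacing NAs\n")
df %>% tidyr::replace_na(list(x = 0, y = "NonNA"))
As you can see, the NA in the numeric column x is replaced with 0 and the NA in the character column y is replaced with "NonNA". Recent versions of tidyr require each replacement to match the type of its column, so a string cannot be used to fill the numeric column x. You can also use the replace_na() function to replace NA values in a vector.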
Sometimes we have vectors with NA values, and one vector may have an NA at a position where another vector has a numerical value. For example, 1, 2, NA and 1, 2, 3.
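A minimal base R sketch of that situation, assuming the goal is to fill each NA in the first vector with the value at the same position in the second. The names x, y and filled are made up for illustration:

```r
x <- c(1, 2, NA)
y <- c(1, 2, 3)

# Keep x where it has a value; otherwise take the value from y
filled <- ifelse(is.na(x), y, x)
print(filled)
```

ifelse() works element by element, so both vectors are expected to have the same length.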
replace_na(data, replace, ...)
data: A data frame or vector.
replace: If data is a vector, replace takes a single value. If data is a data frame, replace takes a list of values, with one value for each column that has NA values to be replaced.
If the input data is a data frame, replace_na() returns a data frame.
Assuming A is as shown reproducibly in the Note at the end, the difference of the two cumsum calls is nonzero (and so treated as TRUE) from each 1 up to, but not including, the following 4, and the second condition, A == "Na", then drops the 1 itself. Finally we replace the positions that are still TRUE with 2.
replace(A, (cumsum(A == 1) - cumsum(A == 4)) & (A == "Na"), 2)
giving:
[1] "Na" "Na" "Na" "Na" "1" "2" "2" "4" "Na" "Na" "1" "2" "2" "2" "2"
[16] "2" "4" "Na" "1" "2" "4"
R is case sensitive and Na is not the same as NA. The sample data in the question showed Na values and not NA values, but if what was actually meant was a numeric vector with NA values, as in AA in the Note below, then modify the expression as shown here:
replace(AA, cumsum(!is.na(AA) & AA == 1) - cumsum(!is.na(AA) & AA == 4) & is.na(AA), 2)
giving:
[1] NA NA NA NA 1 2 2 4 NA NA 1 2 2 2 2 2 4 NA 1 2 4
Note
A <- c("Na", "Na", "Na", "Na", "1", "Na", "Na", "4", "Na", "Na",
  "1", "Na", "Na", "Na", "Na", "Na", "4", "Na", "1", "Na", "4")
AA <- as.numeric(replace(A, A == "Na", NA))
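To see why the cumsum difference works, it may help to print the intermediate values. The sketch below uses the numeric vector AA from the Note; the names is_one, is_four, open_run and to_fill are introduced here for illustration only. It also tests open_run > 0 explicitly instead of relying on nonzero values being treated as TRUE, which gives the same result for this data:

```r
AA <- c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1,
        NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)

is_one  <- !is.na(AA) & AA == 1   # TRUE only at the 1s
is_four <- !is.na(AA) & AA == 4   # TRUE only at the 4s

# Number of 1s seen so far minus number of 4s seen so far:
# 1 from each 1 up to the element before the next 4, 0 elsewhere
open_run <- cumsum(is_one) - cumsum(is_four)
print(open_run)

# Keep only the NA positions inside an open run, then fill them
to_fill <- open_run > 0 & is.na(AA)
result <- replace(AA, to_fill, 2)
print(result)
```

The counter drops back to 0 at the 4 itself, so only the 1 has to be excluded, and is.na(AA) takes care of that.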
I'm sure there's a better solution to this problem but this should do the trick:
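A <- c(NA, NA, NA, NA, 1, NA, NA, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)
replace <- FALSE # TRUE while we are past a 1 and waiting for a 4
for (i in seq_along(A)) {
  if (!is.na(A[i])) {
    if (A[i] == 1) {
      start <- i + 1 # first position after the 1
      replace <- TRUE
    }
    if (A[i] == 4 && replace) {
      # skip when the 4 directly follows the 1, otherwise start:(i - 1) would run backwards
      if (start < i) A[start:(i - 1)] <- 2
      replace <- FALSE
    }
  }
}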
EDIT: if you only want to replace the NAs when there is nothing else (for example a 3) between the 1 and the 4, you could use this:
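A <- c(NA, NA, NA, NA, 1, NA, 3, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)
replace <- FALSE # TRUE while we are past a 1 and waiting for a 4
for (i in seq_along(A)) {
  if (!is.na(A[i])) {
    if (A[i] == 1) {
      start <- i + 1 # first position after the 1
      replace <- TRUE
    }
    if (A[i] == 4 && replace) {
      # skip when the 4 directly follows the 1
      if (start < i) A[start:(i - 1)] <- 2
      replace <- FALSE
    }
    if (A[i] != 4 && A[i] != 1) {
      replace <- FALSE # any other value cancels the pending replacement
    }
  }
}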
Output:
> A
[1] NA NA NA NA 1 NA 3 4 NA NA 1 2 2 2 2 2 4 NA 1 2 4
And if you only want to replace NAs but keep other values between 1 and 4 use this:
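A <- c(NA, NA, NA, NA, 1, NA, 3, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)
replace <- FALSE # TRUE while we are past a 1 and waiting for a 4
for (i in seq_along(A)) {
  if (!is.na(A[i])) {
    if (A[i] == 1) {
      start <- i + 1 # first position after the 1
      replace <- TRUE
    }
    if (A[i] == 4 && replace) {
      if (start < i) { # skip when the 4 directly follows the 1
        sub <- A[start:(i - 1)]
        sub[is.na(sub)] <- 2 # fill only the NAs, keep other values
        A[start:(i - 1)] <- sub
      }
      replace <- FALSE
    }
  }
}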
Output:
> A
[1] NA NA NA NA 1 2 3 4 NA NA 1 2 2 2 2 2 4 NA 1 2 4
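If the same logic is needed in more than one place, the loop could be wrapped in a function. This is only a sketch: fill_between and its arguments from, to and fill are hypothetical names, and it assumes that a 1 directly followed by a 4, or a trailing 1 with no closing 4, should leave the vector unchanged:

```r
# Hypothetical helper: fill the NAs that lie between `from` and the next `to`
fill_between <- function(x, from = 1, to = 4, fill = 2) {
  start <- NA_integer_            # index just after the most recent `from`
  for (i in seq_along(x)) {
    if (is.na(x[i])) next
    if (x[i] == from) {
      start <- i + 1L
    } else if (x[i] == to && !is.na(start)) {
      if (start < i) {            # skip empty runs, e.g. a 1 directly followed by a 4
        idx <- start:(i - 1L)
        x[idx[is.na(x[idx])]] <- fill   # fill only the NAs, keep other values
      }
      start <- NA_integer_
    }
  }
  x
}

A <- c(NA, NA, NA, NA, 1, NA, 3, 4, NA, NA, 1, NA, NA, NA, NA, NA, 4, NA, 1, NA, 4)
print(fill_between(A))
print(fill_between(c(1, 4, NA, 1, NA)))
```

The first call reproduces the output shown above; the second leaves its input untouched because neither run contains an NA that is closed by a 4.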