In my data frame, I want to replace certain cells, both blank ones and ones that hold values, with NA. Which cells I want to replace has nothing to do with the value a cell stores; it depends on the combination of row and column the cell sits in.
Here's a sample data frame DF:
Fruits Price Weight Number of pieces
Apples 20 2 10
Oranges 15 4 16
Pineapple 40 8 6
Avocado 60 5 20
I want to replace Pineapple's weight with NA and Oranges' number of pieces with NA.
DF$Weight[3] <- NA
DF$`Number of pieces`[2] <- NA
This replaces whatever value is stored at that position, and the position may change. I want to use specific row and column names for the replacement so that the position of the value becomes irrelevant.
Output:
Fruits Price Weight Number of pieces
Apples 20 2 10
Oranges 15 4 NA
Pineapple 40 NA 6
Avocado 60 5 20
But if the order of the table changes, this would replace the wrong values with NA.
How should I do this?
Since a data frame is two-dimensional, you can first find the index of the row that contains a specific value and then use that index to select the cell.
which(DF$Fruits == "Pineapple")
[1] 3
DF$Weight[which(DF$Fruits == "Pineapple")] <- NA
Be aware that which() returns a vector, so if several rows have the fruit "Pineapple", it returns the indices of all of them and the previous command sets Weight to NA in every one of those rows.
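The same idea can be written with a logical row condition and a column name inside a single `[ ]` call. The following is only a minimal sketch: it rebuilds the sample data with `data.frame()` (using character strings for Fruits rather than a factor, which is an assumption) and shuffles the rows to show that the result does not depend on position.

```r
# Rebuild the sample data; check.names = FALSE keeps the space in the column name
DF <- data.frame(
  Fruits = c("Apples", "Oranges", "Pineapple", "Avocado"),
  Price = c(20L, 15L, 40L, 60L),
  Weight = c(2L, 4L, 8L, 5L),
  `Number of pieces` = c(10L, 16L, 6L, 20L),
  check.names = FALSE
)

# Shuffle the rows to show that row position no longer matters
DF <- DF[c(3, 1, 4, 2), ]

# Select the row by the value in Fruits and the column by its name
DF[DF$Fruits == "Pineapple", "Weight"] <- NA
DF[DF$Fruits == "Oranges", "Number of pieces"] <- NA

print(DF)
```

As with which(), every row that matches the condition is changed, so duplicated fruit names would all be set to NA.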
Here is a way using the function is.na<-.
is.na(DF$Weight) <- DF$Fruits == "Pineapple"
is.na(DF$`Number of pieces`) <- DF$Fruits == "Oranges"
DF
# Fruits Price Weight Number of pieces
#1 Apples 20 2 10
#2 Oranges 15 4 NA
#3 Pineapple 40 NA 6
#4 Avocado 60 5 20
Data in dput format.
DF <-
structure(list(Fruits = structure(c(1L, 3L, 4L, 2L),
.Label = c("Apples", "Avocado", "Oranges", "Pineapple"),
class = "factor"), Price = c(20L, 15L, 40L, 60L),
Weight = c(2L, 4L, 8L, 5L), `Number of pieces` = c(10L,
16L, 6L, 20L)), class = "data.frame", row.names = c(NA, -4L))
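If there are many cells to blank out, the is.na<- approach can be driven by a small lookup of fruit and column pairs. This is only a sketch: the `targets` list and the loop are illustrative names that do not appear in the answer above, and the data is rebuilt with `data.frame()` so the block runs on its own.

```r
# Sample data, rebuilt so the example is self-contained
DF <- data.frame(
  Fruits = c("Apples", "Oranges", "Pineapple", "Avocado"),
  Price = c(20L, 15L, 40L, 60L),
  Weight = c(2L, 4L, 8L, 5L),
  `Number of pieces` = c(10L, 16L, 6L, 20L),
  check.names = FALSE
)

# Hypothetical lookup: name = value in Fruits, element = column to set to NA
targets <- list(Pineapple = "Weight", Oranges = "Number of pieces")

for (fruit in names(targets)) {
  column <- targets[[fruit]]
  # Mark as NA every entry of `column` whose row has this fruit
  is.na(DF[[column]]) <- DF$Fruits == fruit
}

print(DF)
```

Because rows are matched on the Fruits value and columns on their names, reordering the rows or columns of DF before the loop gives the same cells.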