In the R data frame coded for below, I would like to replace all of the times that B appears with b.
junk <- data.frame(x <- rep(LETTERS[1:4], 3), y <- letters[1:12])
colnames(junk) <- c("nm", "val")
This provides:
nm val
1 A a
2 B b
3 C c
4 D d
5 A e
6 B f
7 C g
8 D h
9 A i
10 B j
11 C k
12 D l
My initial attempt was to use a for and if statements like so:
for(i in junk$nm) if(i %in% "B") junk$nm <- "b"
but as I am sure you can see, this replaces ALL of the values of junk$nm with b. I can see why this is doing this but I can't seem to get it to replace only those cases of junk$nm where the original value was B.
NOTE: I managed to solve the problem with gsub but in the interest of learning R I still would like to know how to get my original approach to work (if it is possible).
Replacing column values in an R data frame based on a logical condition is pretty straightforward. All you need to do is select the column vector you want to update and put the condition within [].
The replace() function in R replaces the values of a vector x at the indices given by list with those given by values. It takes three arguments: the vector, the index (or logical condition) selecting the elements to replace, and the replacement values.
Easier to convert nm to characters and then make the change:
junk$nm <- as.character(junk$nm)
junk$nm[junk$nm == "B"] <- "b"
EDIT: And if you do need to keep nm as a factor, add this at the end:
junk$nm <- as.factor(junk$nm)
Another useful way to replace values:
library(plyr)
junk$nm <- revalue(junk$nm, c("B"="b"))
Short answer is:
junk$nm[junk$nm %in% "B"] <- "b"
Take a look at Index vectors in R Introduction (if you haven't read it yet).
EDIT: As noted in the comments, this solution works for character vectors, so it fails on your data, where nm is a factor.
For a factor, the best way is to change the level:
levels(junk$nm)[levels(junk$nm)=="B"] <- "b"
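To see why the level-based approach matters, here is a minimal sketch, assuming nm is stored as a factor (the default for data.frame() before R 4.0.0, forced here with stringsAsFactors = TRUE). The names junk_f and broken are only illustrative.

```r
# Build the data with nm explicitly stored as a factor.
junk_f <- data.frame(nm = rep(LETTERS[1:4], 3),
                     val = letters[1:12],
                     stringsAsFactors = TRUE)

# Assigning a value that is not an existing level yields NA
# (R also emits an "invalid factor level" warning, suppressed here).
broken <- junk_f
suppressWarnings(broken$nm[broken$nm == "B"] <- "b")
sum(is.na(broken$nm))   # the three former "B" rows are now NA

# Renaming the level instead changes every "B" at once.
levels(junk_f$nm)[levels(junk_f$nm) == "B"] <- "b"
table(junk_f$nm)
```

Renaming the level touches a single entry in the factor's level table, so it does not depend on how many rows carry that value.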
As the data you show are factors, it complicates things a little bit. @diliop's answer approaches the problem by converting nm to a character variable. To get back to the original factors a further step is required.
An alternative is to manipulate the levels of the factor in place.
> lev <- with(junk, levels(nm))
> lev[lev == "B"] <- "b"
> junk2 <- within(junk, levels(nm) <- lev)
> junk2
nm val
1 A a
2 b b
3 C c
4 D d
5 A e
6 b f
7 C g
8 D h
9 A i
10 b j
11 C k
12 D l
That is quite simple and I often forget that there is a replacement function for levels().
Edit: As noted by @Seth in the comments, this can be done in a one-liner, without loss of clarity:
within(junk, levels(nm)[levels(nm) == "B"] <- "b")
The easiest way to do this in one command is to use which(). This avoids converting the factor to character, provided "b" is already one of its levels (otherwise the assignment yields NA):
junk$nm[which(junk$nm=="B")]<-"b"
If you are working with character variables (note that stringsAsFactors is false here) you can use replace:
junk <- data.frame(x <- rep(LETTERS[1:4], 3), y <- letters[1:12], stringsAsFactors = FALSE)
colnames(junk) <- c("nm", "val")
junk$nm <- replace(junk$nm, junk$nm == "B", "b")
junk
# nm val
# 1 A a
# 2 b b
# 3 C c
# 4 D d
# ...
You have created a factor variable in nm so you either need to avoid doing so or add an additional level to the factor attributes. You should also avoid using <- in the arguments to data.frame().
Option 1:
junk <- data.frame(nm = rep(LETTERS[1:4], 3), val = letters[1:12], stringsAsFactors=FALSE)
junk$nm[junk$nm == "B"] <- "b"
Option 2:
levels(junk$nm) <- c(levels(junk$nm), "b")
junk$nm[junk$nm == "B"] <- "b"
junk
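One side effect of Option 2 is that the old level "B" stays in the factor even though no row uses it any more. A minimal sketch, assuming nm is a factor (forced here with stringsAsFactors = TRUE) and using the base function droplevels() to discard the unused level. The name junk_o2 is only illustrative.

```r
junk_o2 <- data.frame(nm = rep(LETTERS[1:4], 3),
                      val = letters[1:12],
                      stringsAsFactors = TRUE)

# Option 2: register the new level first, then assign it.
levels(junk_o2$nm) <- c(levels(junk_o2$nm), "b")
junk_o2$nm[junk_o2$nm == "B"] <- "b"

levels(junk_o2$nm)   # "B" is still listed, although no row uses it

# Drop levels that no longer occur in the data.
junk_o2$nm <- droplevels(junk_o2$nm)
levels(junk_o2$nm)
```

Whether the leftover level matters depends on what comes next: functions such as table() report unused levels with a count of zero.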
You can use ifelse too, which is very simple to understand:
junk$val <- ifelse(junk$nm == "B", "b", junk$val)
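If the goal is to change nm itself, as in the question, the same pattern can target that column. A minimal sketch, assuming nm is a character column (ifelse() does not preserve factors, so a factor would need converting first). The name junk_c is only illustrative.

```r
junk_c <- data.frame(nm = rep(LETTERS[1:4], 3),
                     val = letters[1:12],
                     stringsAsFactors = FALSE)

# Where nm is "B" use "b"; otherwise keep the existing value.
junk_c$nm <- ifelse(junk_c$nm == "B", "b", junk_c$nm)
head(junk_c, 4)
```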
If you still want to do it with a for loop, the correct way of doing it is:
for(i in 1:nrow(junk)){
  if(junk[i, "nm"] == "B"){
    junk[i, "val"] <- "b"
  }
}
junk
> junk
nm val
1 A a
2 B b
3 C c
4 D d
5 A e
6 B b
7 C g
8 D h
9 A i
10 B b
11 C k
12 D l
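As for the loop in the original question, junk$nm <- "b" overwrites the whole column as soon as one match is found. Indexing by row position avoids that. A minimal sketch that changes nm rather than val, assuming nm is a character column. The name junk_l is only illustrative.

```r
junk_l <- data.frame(nm = rep(LETTERS[1:4], 3),
                     val = letters[1:12],
                     stringsAsFactors = FALSE)

# Loop over row positions, not over the values themselves,
# so that each assignment touches a single element.
for (i in seq_along(junk_l$nm)) {
  if (junk_l$nm[i] == "B") {
    junk_l$nm[i] <- "b"
  }
}
junk_l$nm
```

The vectorised form junk_l$nm[junk_l$nm == "B"] <- "b" does the same thing without an explicit loop.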