I have a data frame. One of the columns has values like:
WIND
WINDS
HIGH WIND
etc
among other values. I want to replace every value that contains some variation of "WIND" with "WIND". I know how to find the values that I need to replace:
grep("WIND", df$col1)
but not how to replace those values. Thanks.
The replace() function in R returns a copy of a vector x in which the elements at the positions given in list are replaced by those given in values. It takes three arguments: the vector, the indices (or a logical mask) of the elements to replace, and the replacement values.
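As a minimal sketch of the three arguments described above, assuming a small made-up character vector (the name x and its contents are illustrative, not from the question), replace() can take the logical mask from grepl directly:

```r
# Illustrative vector; names and values are assumptions for this sketch.
x <- c("WINDS", "HIGH WIND", "RAIN")

# replace(x, list, values): 'list' may be a logical mask or integer indices.
# The result is a modified copy; x itself is left unchanged.
y <- replace(x, grepl("WIND", x), "WIND")

print(y)
```

Because replace() returns a copy, the result has to be assigned back (for example to df$col1) for the change to stick.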
You can subset the original column with grepl and assign the replacement to the matching values:
df$col1[grepl("WIND", df$col1)] <- "WIND"
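For a self-contained version of the same idea, here is a sketch on a hypothetical data frame (the contents of df are made up; only the column name col1 comes from the question). It assumes col1 is a character column; with a factor column, "WIND" would need to be an existing level.

```r
# Hypothetical data; only the column name col1 follows the question.
df <- data.frame(col1 = c("WIND", "WINDS", "HIGH WIND", "RAIN", NA),
                 stringsAsFactors = FALSE)

# grepl() returns one TRUE/FALSE per element (FALSE for NA),
# so it can be used directly as a logical index.
is_wind <- grepl("WIND", df$col1, fixed = TRUE)
df$col1[is_wind] <- "WIND"

print(df$col1)
```

Using fixed = TRUE treats the pattern as a literal string rather than a regular expression, which is enough here and avoids surprises if the search term ever contains special characters.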
UPDATE: a bit of a brainfart on my part: agrep actually doesn't add anything here over grep, so you can just replace the agrep with grep. It does help if some of your words have roots that vary slightly but you still want to match them.
Here is an approach using agrep:
> wind.vec
[1] "WINDS" "HIGH WIND" "WINDY" "VERY WINDY"
> wind.vec[agrep("WIND", wind.vec)] <- "WIND"
> wind.vec
[1] "WIND" "WIND" "WIND" "WIND"
The nice thing about agrep is that it matches approximately, so "WINDY" is replaced. Note that I'm doing this with a vector, but you can easily extend it to a data frame by replacing wind.vec with my.data.frame$my.wind.col.
agrep returns the indices that match approximately, which then allows me to use the [<- replacement operator to replace the approximately matching values with "WIND".
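To illustrate where agrep and grep actually differ, here is a sketch with a misspelled entry. The vector contents (including the typo "WND") are made up for this example, and max.distance = 1 is an assumed tolerance of one edit, not a recommendation.

```r
# Made-up values; "STRONG WND" contains a typo that an exact match misses.
wind.vec <- c("WINDS", "STRONG WND", "RAIN")

# Exact substring match: only finds entries that literally contain "WIND".
exact.idx <- grep("WIND", wind.vec, fixed = TRUE)

# Approximate match: allows up to one insertion, deletion or substitution.
approx.idx <- agrep("WIND", wind.vec, max.distance = 1)

# Replace the approximately matching values, as in the answer above.
wind.vec[approx.idx] <- "WIND"
print(wind.vec)
```

The trade-off is that a looser tolerance can also catch unrelated words that are one edit away from the pattern, so it is worth inspecting the matched values before overwriting them.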