If my data frame (df) looks like this:
Name        State
John Smith  MI
John Smith  WI
Jeff Smith  WI
I want to rename the John Smith from WI to "John Smith1". What is the cleanest R equivalent of the following SQL statement?
update df set Name = "John Smith1" where Name = "John Smith" and State = "WI"
df <- data.frame(Name = c('John Smith', 'John Smith', 'Jeff Smith'),
                 State = c('MI', 'WI', 'WI'),
                 stringsAsFactors = FALSE)
df <- within(df, Name[Name == 'John Smith' & State == 'WI'] <- 'John Smith1')
> df
         Name State
1  John Smith    MI
2 John Smith1    WI
3  Jeff Smith    WI
** Edit **
Edited to add that you can put whatever you like in the within expression:
df <- within(df, {
    f <- Name == 'John Smith' & State == 'WI'
    Name[f] <- 'John Smith1'
    State[f] <- 'CA'
})
One way:
df[df$Name == "John Smith" & df$State == "WI", "Name"] <- "John Smith1"
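One caveat with logical indexing: if the condition evaluates to NA for some row (for example because State is missing), assigning into a data frame with that index fails. Below is a minimal sketch of one way around this, wrapping the condition in which(). The fourth row and the name cond are made up here for illustration and are not part of the question's data.

```r
# Hypothetical data: the question's rows plus one row with a missing State
df <- data.frame(Name  = c("John Smith", "John Smith", "Jeff Smith", "John Smith"),
                 State = c("MI", "WI", "WI", NA),
                 stringsAsFactors = FALSE)

# The condition is NA for the last row because State is NA there
cond <- df$Name == "John Smith" & df$State == "WI"

# which() returns only the positions where cond is TRUE, dropping NA
df[which(cond), "Name"] <- "John Smith1"

print(df)
```

Only the row where both comparisons are TRUE is renamed; the row with the missing State is left as it was.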
Another way, using dplyr:
library(dplyr)
# mutate() returns a new data frame, so assign the result back to df
df <- df %>% mutate(Name = ifelse(State == "WI" & Name == "John Smith", "John Smith1", Name))
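For anyone without dplyr installed, the same ifelse() logic can probably be written with base R's transform(). This is a sketch of an alternative, not part of the answer above, and it assumes Name is a character column.

```r
# Data from the question, with Name kept as character
df <- data.frame(Name  = c("John Smith", "John Smith", "Jeff Smith"),
                 State = c("MI", "WI", "WI"),
                 stringsAsFactors = FALSE)

# ifelse() is vectorised: for each row it uses the new name where the
# condition is TRUE and keeps the existing Name otherwise
df <- transform(df, Name = ifelse(State == "WI" & Name == "John Smith",
                                  "John Smith1", Name))

print(df)
```

Like mutate(), transform() returns a modified copy, so the result has to be assigned back to df.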
Note: As David Arenburg says, the first column should not be a factor. To ensure this, read the data set with stringsAsFactors = FALSE.
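If the data frame already exists and Name is a factor, one option is to convert the column before updating it. A minimal sketch, with the factor column built deliberately to reproduce the situation:

```r
# Name is deliberately created as a factor here to show the situation
df <- data.frame(Name  = factor(c("John Smith", "John Smith", "Jeff Smith")),
                 State = c("MI", "WI", "WI"),
                 stringsAsFactors = FALSE)

# Convert to character first; assigning a value that is not an existing
# factor level would otherwise produce NA with a warning
df$Name <- as.character(df$Name)

df$Name[df$Name == "John Smith" & df$State == "WI"] <- "John Smith1"

print(df)
```

An alternative would be to add "John Smith1" to the factor levels before assigning, but converting to character is usually simpler for a column of names.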