Replace a given character in a string variable with a character from another string variable of equal length

Tags:

regex

r

I have a data frame with two string variables that contain an equal number of characters. The strings represent student responses to an exam. The first string contains a + sign for each question answered correctly and the incorrect response for each question answered incorrectly. The second string contains all the correct answers. I want to replace every + sign in the first string with the correct answer from the second string. A simplified toy data set can be created with this code:

df <- data.frame(v1 = c("+AA+B", "D++CC", "A+BAD"), 
                 v2 = c("DBBAD", "BDCAD","CDCCA"), stringsAsFactors = FALSE)

So each + sign in df$v1 needs to be replaced with the letter in df$v2 that sits at the same position, counted from the start of the string. Any ideas?

Braden asked Dec 02 '22 17:12


1 Answer

When df$v1 and df$v2 are character vectors (not factors), we can use

regmatches(df$v1, gregexpr("\\+", df$v1)) <- regmatches(df$v2, gregexpr("\\+", df$v1))

That is,

df <- data.frame(v1 = c("+AA+B", "D++CC", "A+BAD"), 
                 v2 = c("DBBAD", "BDCAD", "CDCCA"), 
                 stringsAsFactors = FALSE)
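# Positions (and match lengths) of every "+" in each element of v1
rg <- gregexpr("\\+", df$v1)
# Take the v2 characters at those positions and write them into v1
regmatches(df$v1, rg) <- regmatches(df$v2, rg)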
df
#      v1    v2
# 1 DAAAB DBBAD
# 2 DDCCC BDCAD
# 3 ADBAD CDCCA

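rg holds the positions of every "+" in df$v1. regmatches(df$v2, rg) extracts whatever df$v2 has at those same positions, and the replacement form regmatches(df$v1, rg) <- ... writes those characters into df$v1 in place of the "+" signs. This relies on the two strings in each row having the same length, as stated in the question.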
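To see what happens at each step, the intermediate values can be inspected, and the result can be cross-checked with a character-by-character version. This is only a sketch on the same toy data; fill_plus is a hypothetical helper name introduced here for illustration, not part of the answer above.

```r
# Same toy data as above
v1 <- c("+AA+B", "D++CC", "A+BAD")
v2 <- c("DBBAD", "BDCAD", "CDCCA")

# Step 1: where are the "+" signs in the first response string?
rg <- gregexpr("\\+", v1)
as.integer(rg[[1]])
# [1] 1 4

# Step 2: which answer-key characters sit at those positions?
regmatches(v2, rg)[[1]]
# [1] "D" "A"

# Cross-check: split both strings into single characters and pick the
# key character wherever the response is "+" (hypothetical helper).
fill_plus <- function(resp, key) {
  r <- strsplit(resp, "", fixed = TRUE)[[1]]
  k <- strsplit(key, "", fixed = TRUE)[[1]]
  paste(ifelse(r == "+", k, r), collapse = "")
}
filled <- unname(mapply(fill_plus, v1, v2))
filled
# [1] "DAAAB" "DDCCC" "ADBAD"
```

The character-by-character version is slower on long vectors but makes the position-matching logic explicit, which can help when checking the regmatches result.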

Julius Vainora answered Dec 19 '22 06:12