
Replace with multiple elements in a column based on condition

Tags: replace, search, r

I am having some problems replacing a value in a column of a dataframe.

I have two data frames that look like this:

a results table:

r <- data.frame(d = c("100", "100,111", "100,111,123"), r = c("3", "3,6,7", "42,57"))

a mapping table:

m <- data.frame(id = c("3", "6", "7", "42", "57", "100", "111", "123"), name= c("tc1", "tc2", "tc3", "tc4", "tc5", "tc6", "tc7", "tc8"))

Now I want the strings in m$name to replace the numbers in r$d and r$r based on a match in m$id. The hard part for me is that a single cell can contain multiple comma-separated numbers.

Example: The tuple "100,111" "3,6,7" should be "tc6,tc7" "tc1,tc2,tc3" in the end.

Any help would be highly appreciated.

goegges asked Dec 17 '22 14:12


2 Answers

gsubfn finds every match of the pattern given as its first argument. When the second argument is a list, each match is replaced with the list element whose name equals the matched string. We lapply that over each column of r.

library(gsubfn)

L <- with(m, as.list(setNames(as.character(name), id)))
replace(r, TRUE, lapply(r, function(x) gsubfn("\\d+", L, as.character(x))))

giving:

            d           r
1         tc6         tc1
2     tc6,tc7 tc1,tc2,tc3
3 tc6,tc7,tc8     tc4,tc5
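
For readers who want to see the name-based lookup without installing the package, here is a rough base R sketch of the same idea. It is not how gsubfn is implemented; it swaps in gregexpr and regmatches<- from base R, and the variable names (lookup, hits, out) are made up for illustration. It assumes the columns are character, which is forced here with stringsAsFactors = FALSE.

```r
# Mapping table from the question, forced to character columns (assumption).
m <- data.frame(id   = c("3", "6", "7", "42", "57", "100", "111", "123"),
                name = c("tc1", "tc2", "tc3", "tc4", "tc5", "tc6", "tc7", "tc8"),
                stringsAsFactors = FALSE)

# Named character vector: the names are the ids, the values are the labels.
lookup <- setNames(m$name, m$id)

# One column of the results table.
out <- c("100", "100,111", "100,111,123")

# Locate every run of digits in each string.
hits <- gregexpr("[0-9]+", out)

# Replace each matched run with the lookup value that has the same name.
regmatches(out, hits) <- lapply(regmatches(out, hits),
                                function(ids) unname(lookup[ids]))

print(out)
```

The commas are left untouched because only the digit runs are matched and replaced, which is also why the gsubfn call above keeps the separators intact.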

Note

If the columns of r and m were character rather than factor then we could simplify that a bit.

m[] <- lapply(m, as.character)
r[] <- lapply(r, as.character)

L <- with(m, as.list(setNames(name, id)))
r[] <- lapply(r, gsubfn, pattern = "\\d+", replacement = L)

Or use this for the last line if you want to preserve the input r:

replace(r, TRUE, lapply(r, gsubfn, pattern = "\\d+", replacement = L))
G. Grothendieck answered Jan 25 '23 23:01


Here is a one-liner using base R:

r[] <- lapply(r, function(i) sapply(strsplit(as.character(i), ','), 
                                function(j)paste(m$name[match(j, m$id)], collapse = ',')))

which gives:

            d           r
1         tc6         tc1
2     tc6,tc7 tc1,tc2,tc3
3 tc6,tc7,tc8     tc4,tc5
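
To make the one-liner easier to follow, the sketch below unpacks it for a single cell. The intermediate names (cell, ids, pos, translated) are illustrative only and do not appear in the answer, and the mapping table is assumed to hold character columns.

```r
# Mapping table from the question, forced to character columns (assumption).
m <- data.frame(id   = c("3", "6", "7", "42", "57", "100", "111", "123"),
                name = c("tc1", "tc2", "tc3", "tc4", "tc5", "tc6", "tc7", "tc8"),
                stringsAsFactors = FALSE)

cell <- "3,6,7"

# Step 1: split the cell into its individual ids.
ids <- strsplit(cell, ",")[[1]]

# Step 2: find the row of m that holds each id.
pos <- match(ids, m$id)

# Step 3: look up the names in those rows and glue them back together.
translated <- paste(m$name[pos], collapse = ",")

print(translated)
```

One thing to keep in mind with this approach: match returns NA for an id that is not in m$id, so such an id would show up as the text "NA" in the pasted result.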
Sotos answered Jan 25 '23 23:01