I am having some problems replacing a value in a column of a dataframe.
I have two data frames that look like this:
a results table:
r <- data.frame(d = c("100", "100,111", "100,111,123"), r = c("3", "3,6,7", "42,57"))
a mapping table:
m <- data.frame(id = c("3", "6", "7", "42", "57", "100", "111", "123"), name= c("tc1", "tc2", "tc3", "tc4", "tc5", "tc6", "tc7", "tc8"))
Now I want the strings in m$name to replace the numbers in r$d and r$r based on a match/partial match in m$id. The hard part for me is that multiple numbers can appear in a single cell.
Example: The tuple "100,111" "3,6,7" should be "tc6,tc7" "tc1,tc2,tc3" in the end.
Any help would be highly appreciated.
gsubfn replaces each match of the pattern given in its first argument with the value stored under that name in the list given as its second argument. We lapply that over each column of r.
library(gsubfn)
L <- with(m, as.list(setNames(as.character(name), id)))
replace(r, TRUE, lapply(r, function(x) gsubfn("\\d+", L, as.character(x))))
giving:
            d           r
1         tc6         tc1
2     tc6,tc7 tc1,tc2,tc3
3 tc6,tc7,tc8     tc4,tc5
If the columns of r and m were character rather than factor, then we could simplify that a bit.
m[] <- lapply(m, as.character)
r[] <- lapply(r, as.character)
L <- with(m, as.list(setNames(name, id)))
r[] <- lapply(r, gsubfn, pattern = "\\d+", replacement = L)
Or use this for the last line if you want to preserve the input r:
replace(r, TRUE, lapply(r, gsubfn, pattern = "\\d+", replacement = L))
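In case it helps to see what the gsubfn calls above amount to, here is a rough sketch of the same idea without the package: find every run of digits and swap each one for the entry of that name in a lookup vector. This is only an illustration, not how gsubfn is implemented internally. The names lookup, replace_ids and out are made up for this example, and the sketch assumes every number in r has an entry in m$id.

```r
# Same sample data as in the question, kept as character columns.
r <- data.frame(d = c("100", "100,111", "100,111,123"),
                r = c("3", "3,6,7", "42,57"),
                stringsAsFactors = FALSE)
m <- data.frame(id = c("3", "6", "7", "42", "57", "100", "111", "123"),
                name = c("tc1", "tc2", "tc3", "tc4", "tc5", "tc6", "tc7", "tc8"),
                stringsAsFactors = FALSE)

# Named character vector: names are the ids, values are the replacement names.
lookup <- setNames(as.character(m$name), as.character(m$id))

# Hypothetical helper: replace every run of digits with its lookup value.
# Assumes each id found in x is present in lookup.
replace_ids <- function(x) {
  x <- as.character(x)
  hits <- gregexpr("[0-9]+", x)                # positions of all digit runs
  regmatches(x, hits) <- lapply(regmatches(x, hits),
                                function(ids) unname(lookup[ids]))
  x
}

out <- r                         # leave the input r untouched
out[] <- lapply(r, replace_ids)  # apply to every column, keep data frame shape
print(out)
```

Because only the digit runs are replaced, the commas between them stay where they were, which is why no splitting and re-pasting is needed with this approach.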
Here is a one-liner using base R:
r[] <- lapply(r, function(i) sapply(strsplit(as.character(i), ','),
                                    function(j) paste(m$name[match(j, m$id)], collapse = ',')))
which gives,
            d           r
1         tc6         tc1
2     tc6,tc7 tc1,tc2,tc3
3 tc6,tc7,tc8     tc4,tc5
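One thing to watch with the match() approach: an id that is missing from m$id comes back as NA and gets pasted in as the string "NA". If that can happen in your data, a possible variant is sketched below. The helper name map_ids and the id "999" are invented for this example, and keeping the original number for unmatched ids is just one choice among several.

```r
# Mapping table from the question.
m <- data.frame(id = c("3", "6", "7", "42", "57", "100", "111", "123"),
                name = c("tc1", "tc2", "tc3", "tc4", "tc5", "tc6", "tc7", "tc8"),
                stringsAsFactors = FALSE)

# Hypothetical helper: split each cell on commas, look every id up in `from`,
# and keep the original id wherever no match is found.
map_ids <- function(x, from, to) {
  from <- as.character(from)
  to <- as.character(to)
  parts <- strsplit(as.character(x), ",", fixed = TRUE)
  vapply(parts, function(ids) {
    hit <- match(ids, from)                     # NA where an id is unknown
    paste(ifelse(is.na(hit), ids, to[hit]), collapse = ",")
  }, character(1))
}

# "999" is a made-up id that does not exist in m.
res <- map_ids(c("100,111", "3,999"), m$id, m$name)
print(res)
```

To apply it to the whole data frame, something like r[] <- lapply(r, map_ids, from = m$id, to = m$name) should work in the same way as the one-liner.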