
Using gsub when adding a new column to a data.table

Tags: r, data.table

Sorry for a very basic question; the solution must be very simple, but I'm not able to find it.

When I try to use gsub to add a new column to a data.table, I get the warning "argument 'replacement' has length > 1 and only the first element will be used", and every row of the new column is built with the replacement value taken from the first row.

Here is a simplified case:

library(data.table)
dt <- data.table(v1 = c(1, 2, 3), v2 = c("axb", "cxxd", "exfxgx"))
dt[, v3 := gsub("x", v1, v2)]

In every row of the new column v3, "x" is replaced by "1" (the first value of v1) instead of by that row's own v1 value.

Using other functions, e.g.

dt[ , v3:=paste(v1,v2)]  

works as expected.

I'm using RStudio v0.98.1103, R v3.1.2, data.table v1.9.4.

Asked by mbranco, May 13 '15 15:05



2 Answers

dt[, v3 := gsub("x", v1, v2), by = v1]  
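
A possible explanation of why this works, which the answer itself does not spell out: gsub is vectorized over x but uses only the first element of replacement. With by = v1, the expression is evaluated once per group, and inside each group the grouping column v1 is a single value, so every gsub call receives a length-one replacement. A minimal sketch with the question's data, assuming data.table is installed:

```r
library(data.table)

# Sample data from the question
dt <- data.table(v1 = c(1, 2, 3), v2 = c("axb", "cxxd", "exfxgx"))

# The expression is evaluated once per distinct v1. Inside each group,
# v1 has length 1, so gsub() gets a single replacement value per call
# and the "length > 1" warning no longer applies.
dt[, v3 := gsub("x", v1, v2), by = v1]

print(dt$v3)
```

Note that this makes one gsub call per distinct value of v1, so it may be slow when v1 has many distinct values.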
Answered by Quinn Weber, Oct 20 '22 08:10


The easiest approach would be to use a string-processing package whose functions are vectorized over the replacement argument, such as stringi:

library(stringi)
dt[, v3 := stri_replace_all_fixed(v2, "x", v1)][]
#    v1     v2     v3
# 1:  1    axb    a1b
# 2:  2   cxxd   c22d
# 3:  3 exfxgx e3f3g3

Alternatively, you can make your own "vectorized" version of gsub by using the Vectorize function:

vGsub <- Vectorize(gsub, vectorize.args = c("replacement", "x"))
dt[, v3 := vGsub("x", v1, v2)][]
#    v1     v2     v3
# 1:  1    axb    a1b
# 2:  2   cxxd   c22d
# 3:  3 exfxgx e3f3g3
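
Vectorize is a wrapper around mapply, so the same idea can also be written with mapply directly. The following is a minimal base-R sketch rather than part of the original answer; the helper argument names repl and s are made up for illustration, and fixed = TRUE is an added assumption that the pattern is a literal string, not a regular expression:

```r
# Sample vectors matching the question's columns
v1 <- c(1, 2, 3)
v2 <- c("axb", "cxxd", "exfxgx")

# Call gsub() once per (replacement, string) pair.
# fixed = TRUE treats "x" as a literal string (an assumption for this sketch).
v3 <- mapply(
  function(repl, s) gsub("x", repl, s, fixed = TRUE),
  as.character(v1), v2,
  USE.NAMES = FALSE
)

print(v3)
```

Inside a data.table, the same mapply call can be used on the right-hand side of v3 := in place of vGsub.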
Answered by A5C1D2H2I1M1N2O1R2T1, Oct 20 '22 08:10