
gsub() on multiple dataframes in loop/lapply

Tags: r, gsub

I have two data frames, each with a column named 'Title' that contains strings. I need to strip certain characters from these strings so that I can merge the data frames on that column. I want to do this as cleanly as possible in a loop, so that I only have to write the gsub() call once.

Let's say I have:

df_1 <- read.table(text="
id Title
1 some_average_title
2 another:_one
3 the_third!
4 and_'the'_last
",header=TRUE,sep="")

and:

df_2 <- read.table(text="
id Title
1 some_average.title
2 another:one
3 the_third
4 and_the_last
",header=TRUE,sep="")

I would now run:

df_1$Title <- gsub(" |\\.|'|:|!|\\'|_", "", df_1$Title )
df_2$Title <- gsub(" |\\.|'|:|!|\\'|_", "", df_2$Title )

I tried the following loop:

for (dtfrm in c("df_1", "df_2")) {
  assign(paste0(dtfrm, "$Title"),
    gsub(" |\\.|'|:|!|\\'|", "", get(paste0(dtfrm, "$Title")))
    )
  }

but it doesn't work - despite the lack of error messages.

I was also thinking about lapply(list(df_1, df_2), function(w){ w$Title <- XXX }), but I don't know what to put for XXX, because gsub() needs the vector of strings as its third argument.

MERose asked Oct 31 '22 16:10


1 Answer

Pass only the data frame's name to get() and assign(), and modify the column with transform(). This works:

for(df in c("df_1", "df_2")){
  assign(df, transform(get(df), Title = gsub(" |\\.|'|:|!|\\'|_", "", Title)))
}
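As far as I can tell, the loop in the question goes wrong because assign() and get() treat their first argument as a complete object name, not as an expression to evaluate. A string such as "df_1$Title" therefore refers to an object literally named df_1$Title rather than to the Title column of df_1. A minimal sketch of that behaviour (demo_df is a made-up data frame used only for illustration):

```r
# demo_df is a hypothetical data frame, not part of the question's data
demo_df <- data.frame(id = 1:2, Title = c("a_b", "c.d"),
                      stringsAsFactors = FALSE)

# assign() creates a new object whose *name* is the whole string
assign("demo_df$Title", c("ab", "cd"))

# The column inside demo_df is untouched
demo_df$Title
# [1] "a_b" "c.d"

# The new object is only reachable with backticks or get()
get("demo_df$Title")
# [1] "ab" "cd"
```

This is why the loop above hands only the bare name ("df_1", "df_2") to get() and assign(), and leaves the column handling to transform().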

Testing:

df_1
  id            Title
1  1 someaveragetitle
2  2       anotherone
3  3         thethird
4  4       andthelast

And:

df_2
  id            Title
1  1 someaveragetitle
2  2       anotherone
3  3         thethird
4  4       andthelast
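The lapply() idea from the question can also be made to work if the function returns the whole modified data frame rather than just the column. The following is only a sketch under a few assumptions: the data frames are rebuilt with data.frame() so the block is self-contained, clean_title() and dfs are names invented here, and the pattern is written as a character class, which should match the same characters as the alternation used above.

```r
# Same example data as in the question, rebuilt so this block runs on its own
df_1 <- data.frame(id = 1:4,
                   Title = c("some_average_title", "another:_one",
                             "the_third!", "and_'the'_last"),
                   stringsAsFactors = FALSE)
df_2 <- data.frame(id = 1:4,
                   Title = c("some_average.title", "another:one",
                             "the_third", "and_the_last"),
                   stringsAsFactors = FALSE)

# Hypothetical helper: drop spaces, dots, quotes, colons, bangs, underscores
clean_title <- function(x) gsub("[ .':!_]", "", x)

# Keep the data frames in a named list and clean each one
dfs <- list(df_1 = df_1, df_2 = df_2)
dfs <- lapply(dfs, function(w) {
  w$Title <- clean_title(w$Title)
  w  # return the full data frame, not just the column
})

dfs$df_1$Title
# [1] "someaveragetitle" "anotherone"       "thethird"         "andthelast"
identical(dfs$df_1$Title, dfs$df_2$Title)
# [1] TRUE
```

Keeping the results in the list avoids assign() altogether. If the original variables need to be overwritten, list2env(dfs, envir = .GlobalEnv) is one way to copy them back.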
Carlos Cinelli answered Nov 15 '22 05:11