Concatenate pairs of variables with same suffix

Tags:

for-loop

r

I have a data frame with a number of variables that I want to concatenate pairwise into new variables in the same data frame. A simplified version of my data frame df looks like this:

first.1 second.1 first.2 second.2 
1222 3223 3333 1221 
1111 2212 2232 2113 

Here is how I do it inefficiently without a for loop:

df$concatenated.1 <- paste0(df$first.1,"-",df$second.1)
df$concatenated.2 <- paste0(df$first.2,"-",df$second.2)

Which results in the following data frame df:

first.1 second.1 first.2 second.2 concatenated.1 concatenated.2 
1222 3223 3333 1221 1222-3223 3333-1221 
1111 2212 2232 2113 1111-2212 2232-2113 

I have a lot more than 2 pairs of variables to concatenate, so I would like to do this in a for loop:

for (i in 1:2){
??
}

Any ideas on how to accomplish this?

Abdel asked Dec 30 '18 15:12


2 Answers

If your real data has names which follow a clear pattern as in this example data, Ronak's split / lapply answer is probably best. If not, you can just create vectors of the names and use Map with paste.

new.names <- paste0('concatenated.', 1:2)
names.1 <- paste0('first.', 1:2)
names.2 <- paste0('second.', 1:2)

df[new.names] <- Map(paste, df[names.1], df[names.2], sep = '-')

df

#   first.1 second.1 first.2 second.2 concatenated.1 concatenated.2
# 1    1222     3223    3333     1221      1222-3223      3333-1221
# 2    1111     2212    2232     2113      1111-2212      2232-2113
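
For comparison, the same pairing can be written as an explicit for loop in the shape the question asks for. This is only a sketch: it rebuilds the example data so it runs on its own, and n_pairs is an assumed variable holding the number of pairs.

```r
# Example data from the question
df <- data.frame(first.1 = c(1222, 1111), second.1 = c(3223, 2212),
                 first.2 = c(3333, 2232), second.2 = c(1221, 2113))

n_pairs <- 2  # assumption: the number of pairs is known in advance

# seq_len() is used instead of 1:n_pairs so the loop is skipped when n_pairs is 0
for (i in seq_len(n_pairs)) {
  # Build the three column names for pair i and paste the two source columns
  df[[paste0("concatenated.", i)]] <- paste(df[[paste0("first.", i)]],
                                            df[[paste0("second.", i)]],
                                            sep = "-")
}
```

Map does the same thing without the explicit index: it walks df[names.1] and df[names.2] in parallel and returns one pasted vector per pair.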
IceCreamToucan answered Oct 22 '22 21:10


If you can figure out a way to split your columns into groups, this becomes much easier. For example, based on the provided example, we can group the columns by the last character of each column name (1, 1, 2, 2).

In base R, split.default splits the columns by those group labels (as described above); for each group we paste the values row-wise and add the result as a new column.

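# Last character of each column name is the group label
group_names <- substring(names(df), nchar(names(df)))
groups <- split.default(df, group_names)
# Name the new columns from names(groups): split sorts the groups, so this
# keeps names and values aligned even if the columns are not in sorted order
df[paste0("concatenated.", names(groups))] <-
     lapply(groups, function(x) do.call(paste, c(x, sep = "-")))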

df
#  first.1 second.1 first.2 second.2 concatenated.1 concatenated.2
#1    1222     3223    3333     1221      1222-3223      3333-1221
#2    1111     2212    2232     2113      1111-2212      2232-2113
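
One caveat with taking only the last character: with ten or more pairs, a name such as first.10 would land in a group "0". A possible variation, sketched here on hypothetical data with pair numbers 1 and 10 (not from the question), takes everything after the last dot instead:

```r
# Hypothetical data with pair numbers 1 and 10 (not from the question)
df <- data.frame(first.1 = c(1222, 1111), second.1 = c(3223, 2212),
                 first.10 = c(3333, 2232), second.10 = c(1221, 2113))

# Remove everything up to and including the last dot, keeping the full suffix
group_names <- sub(".*\\.", "", names(df))

# Split the columns by suffix; each element is a data frame of one pair
groups <- split.default(df, group_names)

# Paste each group row-wise and name the new columns after the group labels
df[paste0("concatenated.", names(groups))] <-
  lapply(groups, function(x) do.call(paste, c(x, sep = "-")))
```

This assumes every column name ends in a dot followed by its pair label; columns without that pattern would need to be excluded before splitting.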
Ronak Shah answered Oct 22 '22 22:10