
Concatenate columns and add them to beginning of Data Frame

Tags: dataframe, r

Noob here to R. Trying to figure something out. I need to build a function that adds a new column to the beginning of a dataset. This new column is a concatenation of the values in other columns that the user specifies.

Imagine this is the data set named myDataSet:

col_1    col_2    col_3    col_4
bat      red      1        a
cow      orange   2        b
dog      green    3        c

The user could use the function like so:

addPrimaryKey(myDataSet, cols=c(1,3,4))

to get the result of a new data set with columns 1, 3 and 4 concatenated into a column called ID and added to the beginning, like so:

ID        col_1    col_2    col_3    col_4
bat1a     bat      red      1        a
cow2b     cow      orange   2        b
dog3c     dog      green    3        c

This is the script I have been working on but I have been staring at it so long, I think I have made a few mistakes. I can't figure out how to get the column numbers from the arguments into the paste function properly.

addPrimaryKey <- function(df, cols=NULL){

  newVector = rep(NA, length(cols)) ##initialize vector to length of columns

  colsN <- as.numeric(cols)

  df <- cbind(ID=paste(
    for(i in 1:length(colsN)){
      holder <- df[colsN[i]]
      holder
    }
  , sep=""), df) ##concatenate the selected columns and add as ID column to df
df
}

Any help would be greatly appreciated. Thanks so much

Crayon Constantinople asked Dec 09 '22 09:12


2 Answers

paste0 works fine, with some help from do.call:

do.call(paste0, mydf[c(1, 3, 4)])
# [1] "bat1a" "cow2b" "dog3c"

Your function, thus, can be something like:

addPrimaryKey <- function(inDF, cols) {
  cbind(ID = do.call(paste0, inDF[cols]),
        inDF)
}
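
To see why this works: a data frame is a list of columns, so do.call hands each selected column to paste0 as a separate argument, and paste0 then concatenates them element by element. Here is a minimal sketch run against the example data from the question. The stringsAsFactors = FALSE setting is my assumption, to keep the text columns as plain character vectors.

```r
# Rebuild the example data from the question.
myDataSet <- data.frame(
  col_1 = c("bat", "cow", "dog"),
  col_2 = c("red", "orange", "green"),
  col_3 = 1:3,
  col_4 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

addPrimaryKey <- function(inDF, cols) {
  # inDF[cols] is a list of columns; do.call passes each one to paste0
  cbind(ID = do.call(paste0, inDF[cols]),
        inDF)
}

result <- addPrimaryKey(myDataSet, cols = c(1, 3, 4))

names(result)
# [1] "ID"    "col_1" "col_2" "col_3" "col_4"

as.character(result$ID)
# [1] "bat1a" "cow2b" "dog3c"
```

One caveat: do.call passes the columns as named arguments, so a column literally named collapse would be taken as paste0's collapse argument. Wrapping the selection in unname() avoids that.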

You may also want to look at interaction:

interaction(mydf[c(1, 3, 4)], drop=TRUE)
# [1] bat.1.a cow.2.b dog.3.c
# Levels: bat.1.a cow.2.b dog.3.c
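
interaction separates the values with "." by default, but it has a sep argument to change that. A small sketch, assuming the same example data (called mydf here to match the snippets above), that reproduces the paste0 result:

```r
mydf <- data.frame(
  col_1 = c("bat", "cow", "dog"),
  col_2 = c("red", "orange", "green"),
  col_3 = 1:3,
  col_4 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# sep = "" removes the "." between values; the result is still a factor
key <- interaction(mydf[c(1, 3, 4)], sep = "", drop = TRUE)

# convert to character if a plain ID column is wanted
as.character(key)
# [1] "bat1a" "cow2b" "dog3c"
```

Note that the result is a factor rather than a character vector, which may or may not be what you want for a key column.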
A5C1D2H2I1M1N2O1R2T1 answered May 21 '23 23:05


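This should do the trick:

addPrimaryKey <- function(df, cols) {
  # drop = FALSE keeps a data frame even when only one column is selected
  # trimws() strips the padding apply() adds when it converts numeric columns to text
  ID <- apply(df[, cols, drop = FALSE], 1, function(x) paste(trimws(x), collapse = ""))

  # name the new column ID and put it in front of the original columns
  cbind(ID = ID, df)
}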

Just add some conditional logic to handle the case where cols is NULL, which was the default in your original function.
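
One way that check could look, as a sketch only. Returning the data unchanged when cols is NULL is my assumption, since the question doesn't say what should happen in that case, and the range check on column numbers is an extra I added. It builds the key with do.call(paste0, ...) rather than apply, which avoids converting the rows to a character matrix.

```r
addPrimaryKey <- function(df, cols = NULL) {
  # Assumption: with no columns given, return the data untouched
  if (is.null(cols) || length(cols) == 0) {
    return(df)
  }
  # Fail early on column numbers that do not exist
  if (is.numeric(cols) && any(cols < 1 | cols > ncol(df))) {
    stop("cols must be between 1 and ", ncol(df))
  }
  # unname() so column names are not mistaken for paste0 arguments
  ID <- do.call(paste0, unname(as.list(df[cols])))
  cbind(ID = ID, df)
}

# Example data from the question
myDataSet <- data.frame(
  col_1 = c("bat", "cow", "dog"),
  col_2 = c("red", "orange", "green"),
  col_3 = 1:3,
  col_4 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

as.character(addPrimaryKey(myDataSet, cols = c(1, 3, 4))$ID)
# [1] "bat1a" "cow2b" "dog3c"

identical(addPrimaryKey(myDataSet), myDataSet)
# [1] TRUE
```

If you would rather treat a missing cols as an error, swap the first return(df) for a stop() call.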

Simon Raper answered May 21 '23 22:05