Noob here to R. Trying to figure something out. I need to build a function that adds a new column to the beginning of a dataset. This new column is a concatenation of the values in other columns that the user specifies.
Imagine this is the data set named myDataSet:
col_1 col_2 col_3 col_4
bat red 1 a
cow orange 2 b
dog green 3 c
The user could use the function like so:
addPrimaryKey(myDataSet, cols=c(1,3,4))
to get the result of a new data set with columns 1, 3 and 4 concatenated into a column called ID and added to the beginning, like so:
ID col_1 col_2 col_3 col_4
bat1a bat red 1 a
cow2b cow orange 2 b
dog4c dog green 3 c
This is the script I have been working on but I have been staring at it so long, I think I have made a few mistakes. I can't figure out how to get the column numbers from the arguments into the paste function properly.
addPrimaryKey <- function(df, cols=NULL){
newVector = rep(NA, length(cols)) ##initialize vector to length of columns
colsN <- as.numeric(cols)
df <- cbind(ID=paste(
for(i in 1:length(colsN)){
holder <- df[colsN[i]]
holder
}
, sep=""), df) ##concatenate the selected columns and add as ID column to df
df
}
Any help would be greatly appreciated. Thanks so much
paste0
works fine, with some help from do.call
:
do.call(paste0, mydf[c(1, 3, 4)])
# [1] "bat1a" "cow2b" "dog3c"
Your function, thus, can be something like:
addPrimaryKey <- function(inDF, cols) {
cbind(ID = do.call(paste0, inDF[cols]),
inDF)
}
You may also want to look at interaction
:
interaction(mydf[c(1, 3, 4)], drop=TRUE)
# [1] bat.1.a cow.2.b dog.3.c
# Levels: bat.1.a cow.2.b dog.3.c
This should do the trick
addPrimaryKey <-function(df, cols){
q<-apply(df[,cols], 1, function(x) paste(x, collapse=""))
df<-cbind(q, df)
return(df)
}
Just add in some conditional logic for your nulls
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With