I have a data frame to which I want to add another column that is the result of a computation involving three other columns. The method I am using right now seems to be very slow. Is there a better way to do this? Here is my current approach:
library(bitops)
GetRes <- function(A, B, C) {
  tagU <- bitShiftR(A * C, 4)
  tagV <- bitShiftR(B, 2)
  x <- tagU %% 2
  y <- tagV %% 4
  res <- (2 * x + y) %% 4
  return(res)
}
df <- data.frame(id=letters[1:3],val0=1:3,val1=4:6,val2=7:9)
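# Pass only the numeric columns: apply() converts the data frame to a matrix,
# and including the character id column would turn every value into a string.
apply(df[, 2:4], 1, function(x) GetRes(x[1], x[2], x[3]))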
My data frame is very big and this computation is taking ages. Can someone suggest a better way to do it?
Thanks.
Try mapply:
mapply(GetRes, df[,2], df[,3], df[,4])
If you let us know which package bitShiftR is from, we can test it on bigger data to see whether there is any performance boost.
UPDATE
Quick benchmarking shows that mapply is about twice as fast as your apply:
library(microbenchmark)
microbenchmark(apply(df[,2:4], 1, function(x) GetRes(x[1], x[2], x[3])), mapply(GetRes, df[,2], df[,3], df[,4]))
Unit: microseconds
                                                       expr     min       lq   median      uq      max neval
 apply(df[, 2:4], 1, function(x) GetRes(x[1], x[2], x[3])) 196.985 201.6200 206.7515 216.187 1006.775   100
                 mapply(GetRes, df[, 2], df[, 3], df[, 4])  99.982 105.6105 108.7560 112.232  149.311   100
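The numbers above come from the three-row example, so they mostly measure call overhead. A minimal sketch of how the same comparison could be repeated on bigger data follows. It assumes base R's bitwShiftR can stand in for bitops' bitShiftR so that no package is needed; GetResBase, the row count n, and the random column values are all made up for illustration.

```r
# Stand-in for the question's GetRes: base R's bitwShiftR replaces
# bitops::bitShiftR (an assumption, so the sketch runs without packages).
GetResBase <- function(A, B, C) {
  tagU <- bitwShiftR(A * C, 4)
  tagV <- bitwShiftR(B, 2)
  (2 * (tagU %% 2) + (tagV %% 4)) %% 4
}

set.seed(1)
n <- 20000  # hypothetical size; increase to match your real data
big <- data.frame(val0 = sample(1:1000, n, replace = TRUE),
                  val1 = sample(1:1000, n, replace = TRUE),
                  val2 = sample(1:1000, n, replace = TRUE))

# Time the row-wise apply() call and the element-wise mapply() call.
t_apply <- system.time(
  r_apply <- apply(big, 1, function(x) GetResBase(x[1], x[2], x[3]))
)
t_mapply <- system.time(
  r_mapply <- mapply(GetResBase, big$val0, big$val1, big$val2)
)

# Elapsed seconds for each approach; the values depend on your machine.
print(c(apply = t_apply[["elapsed"]], mapply = t_mapply[["elapsed"]]))

# Both approaches should produce the same values.
stopifnot(isTRUE(all.equal(as.numeric(r_apply), as.numeric(r_mapply))))
```

system.time is used here instead of microbenchmark only to keep the sketch dependency-free; for careful measurements microbenchmark remains the better tool.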
Every operation inside GetRes is already vectorized, so calling it once on whole columns is much faster than any row-by-row alternative you'll be offered. You can just call this:
with(df, GetRes(val0, val1, val2))
or this
GetRes(df$val0, df$val1, df$val2)
or this
GetRes(df[,2], df[,3], df[,4])
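Since the goal in the question is a new column, the vectorized call can be assigned straight back to the data frame. This is a minimal sketch under one assumption: base R's bitwShiftR is used in place of bitops' bitShiftR so that it runs without extra packages, and GetResBase and the column name res are names made up for illustration.

```r
# Stand-in for the question's GetRes: base R's bitwShiftR replaces
# bitops::bitShiftR (an assumption, so the sketch runs without packages).
GetResBase <- function(A, B, C) {
  tagU <- bitwShiftR(A * C, 4)
  tagV <- bitwShiftR(B, 2)
  (2 * (tagU %% 2) + (tagV %% 4)) %% 4
}

df <- data.frame(id = letters[1:3], val0 = 1:3, val1 = 4:6, val2 = 7:9)

# One vectorized call computes the whole column at once.
df$res <- with(df, GetResBase(val0, val1, val2))
print(df)  # res is 1, 3, 3 for this sample data

# Cross-check against the element-by-element mapply() version.
rowwise <- mapply(GetResBase, df$val0, df$val1, df$val2)
stopifnot(all(df$res == rowwise))
```

With the bitops package loaded, the same assignment should work with the original function: df$res <- with(df, GetRes(val0, val1, val2)).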