
Speeding up an if...else loop in R

Tags:

r

Does anyone know how to speed up the following code? I want to replace the numerical "month" values with a character string, e.g. month 1 becomes "Jul".

This loop is really slow because the data frame I am running it on is enormous!

for (i in 1:length(CO2$month)){
    if(CO2$month[i]=='1') {CO2$months[i]<-'Jul'} else
    if(CO2$month[i]=='2') {CO2$months[i]<-'Aug'} else
    if(CO2$month[i]=='3') {CO2$months[i]<-'Sept'} else
    if(CO2$month[i]=='4') {CO2$months[i]<-'Oct'} else
    if(CO2$month[i]=='5') {CO2$months[i]<-'Nov'} else
    if(CO2$month[i]=='6') {CO2$months[i]<-'Dec'} else
    if(CO2$month[i]=='7') {CO2$months[i]<-'Jan'} else
    if(CO2$month[i]=='8') {CO2$months[i]<-'Feb'} else
    if(CO2$month[i]=='9') {CO2$months[i]<-'Mar'} else
    if(CO2$month[i]=='10') {CO2$months[i]<-'Apr'} else
    if(CO2$month[i]=='11') {CO2$months[i]<-'May'} else
    if(CO2$month[i]=='12') {CO2$months[i]<-'Jun'}
}
ThallyHo asked Nov 02 '12 16:11


People also ask

Why is a for loop so slow in R?

Loops are slower in R than in C++ because R is an interpreted language rather than a compiled one. Just-in-time (JIT) compilation, available in R (>= 3.4), makes R loops faster, though still not as fast as compiled code. That said, R loops are not that bad if you keep the number of iterations modest (say, no more than 100,000).

Why is my R code taking so long?

There is a lot of overhead in the processing because R needs to check the type of a variable nearly every time it looks at it. This makes it easy to change types and reuse variable names, but slows down computation for very repetitive tasks, like performing an action in a loop.

Is apply faster than for loop in R?

The apply functions (apply, sapply, lapply etc.) are marginally faster than a regular for loop, but still do their looping in R, rather than dropping down to the lower level of C code.
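To see how these approaches compare on your own machine, a rough timing sketch along the following lines may help. The data here is made up (100,000 random month codes), the variable names are hypothetical, and the timings will vary between machines and R versions, so treat it as an illustration rather than a benchmark.

```r
# Hypothetical data: 100,000 month codes in 1..12.
set.seed(1)
month <- sample(1:12, 1e5, replace = TRUE)
labels <- month.abb[c(7:12, 1:6)]  # "Jul", "Aug", ..., "Jun"

# 1. Explicit for loop writing into a preallocated character vector.
loop_result <- character(length(month))
t_loop <- system.time(
  for (i in seq_along(month)) loop_result[i] <- labels[month[i]]
)

# 2. vapply: the iteration still calls an R function once per element.
t_vapply <- system.time(
  vapply_result <- vapply(month, function(m) labels[m], character(1))
)

# 3. Vectorised indexing: a single call, with the looping done in C.
t_vec <- system.time(vec_result <- labels[month])

# Elapsed seconds for each approach.
print(c(loop       = t_loop[["elapsed"]],
        vapply     = t_vapply[["elapsed"]],
        vectorised = t_vec[["elapsed"]]))
```

All three produce the same character vector; only the time taken differs.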


3 Answers

You can do it without a loop and without if-else:

set.seed(21)
CO2 <- data.frame(month=as.character(sample(1:12,24,TRUE)),
  stringsAsFactors=FALSE)
MonthAbbRotated <- month.abb[c(7:12,1:6)]
CO2$months <- MonthAbbRotated[as.numeric(CO2$month)]

If your month column isn't really character, this is even easier:

set.seed(21)
CO2 <- data.frame(month=sample(1:12,24,TRUE))
MonthAbbRotated <- month.abb[c(7:12,1:6)]
CO2$months <- MonthAbbRotated[CO2$month]
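A possible variation on the same idea, not part of the answer above, is to key the lookup by name so that missing or unexpected codes come back as NA instead of causing surprises. The name month_lookup and the sample data are hypothetical. Note also that month.abb spells September as "Sep", whereas the question used 'Sept'.

```r
# Lookup table keyed by the month code as a string: "1" -> "Jul", ..., "12" -> "Jun".
month_lookup <- setNames(month.abb[c(7:12, 1:6)], 1:12)

# Hypothetical input containing a missing value and an out-of-range code.
CO2 <- data.frame(month = c("1", "12", NA, "13", "3"),
                  stringsAsFactors = FALSE)

# Index by name; as.character() makes this work for numeric columns too.
# Codes that are NA or not in the table yield NA.
CO2$months <- unname(month_lookup[as.character(CO2$month)])

print(CO2)
```

Because the whole column is translated in one indexing call, this stays vectorised like the code above.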
Joshua Ulrich answered Nov 15 '22 04:11


I could be missing something, but why not just use a factor?

CO2$month <- factor(CO2$month, levels=1:12, labels=c("Jul","Aug","Sept","Oct","Nov","Dec","Jan","Feb","Mar","Apr","May","Jun"))
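As a small sketch of what the factor approach gives you, assuming a toy data frame with a numeric month column (the names fiscal_labels and sorted are made up for illustration): the labels keep the Jul-to-Jun order as their level order, so sorting and tabulating follow that order rather than the alphabet, and as.character() recovers plain strings when needed.

```r
# Labels in the Jul-to-Jun order used in the question.
fiscal_labels <- c("Jul", "Aug", "Sept", "Oct", "Nov", "Dec",
                   "Jan", "Feb", "Mar", "Apr", "May", "Jun")

# Hypothetical data with numeric month codes.
CO2 <- data.frame(month = c(7, 1, 12, 3))
CO2$month <- factor(CO2$month, levels = 1:12, labels = fiscal_labels)

# Plain character strings, if a factor is not wanted downstream.
month_strings <- as.character(CO2$month)

# Sorting follows the level order (Jul first), not alphabetical order.
sorted <- sort(CO2$month)

print(month_strings)
print(sorted)
```

One thing to keep in mind is that any code outside 1 to 12 becomes NA in the resulting factor.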

frankc answered Nov 15 '22 05:11


# Month labels in the same Jul-to-Jun order as in the question
month <- c("jul","aug","sep","oct","nov","dec","jan","feb","mar","apr","may","jun")

for (i in seq_along(CO2$month)) { CO2$month[i] <- month[as.integer(CO2$month[i])] }
AGS answered Nov 15 '22 04:11