My current learning goal in R is to avoid for loops. I very often have to list the files in a directory (or loop through directories) to perform various operations on those files.
One example of such a task is the following: I have to invoke a system application called cdo to merge two files. The syntax of this command is, let's say: cdo merge input_file1 input_file2 output_file.
My current R code looks like this:
# set lists of files
u.files <- c("uas_Amon_ACCESS1-3.nc", "uas_Amon_CMCC-CESM.nc", "uas_Amon_CMCC-CESM.nc")
v.files <- c("vas_Amon_ACCESS1-3.nc", "vas_Amon_CMCC-CESM.nc", "vas_Amon_CMCC-CESM.nc")
for (i in 1:length(u.files)) {
  # set input file 1 to use on cdo
  input1 <- paste(u.files[i], sep='')
  # set input file 2 to use on cdo
  input2 <- paste(v.files[i], sep='')
  # set output file to use on cdo
  output <- paste('output_', u.files[i], sep='')
  # assemble the command string
  comm <- paste('cdo merge', input1, input2, output, collapse='')
  # submit the command
  system(comm)
}
This works, although it does not look very good.
However, I often hear people say that for loops in R are slow and should be avoided as much as possible.
Is there any way to avoid the for loops and make the code more efficient/legible in cases like this?
This is more R-idiomatic:
u.files <- c("uas_Amon_ACCESS1-3.nc", "uas_Amon_CMCC-CESM.nc", "uas_Amon_CMCC-CESM.nc")
v.files <- c("vas_Amon_ACCESS1-3.nc", "vas_Amon_CMCC-CESM.nc", "vas_Amon_CMCC-CESM.nc")
output <- paste('output_', u.files, sep='')
comm <- paste('cdo merge', u.files, v.files, output)
lapply(comm,system)
Remember that many functions in R, including paste, are vectorized, so you don't have to call paste for each iteration of a loop. In the end you obtain a vector of commands, which lapply executes one by one in the last line.
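As a minimal sketch of the same idea, the commands can be built and inspected before anything is run. The helper name build_merge_commands is hypothetical (it is not part of R or cdo), and deriving the v file names with sub() assumes they differ from the u file names only in the leading "uas"/"vas" prefix.

```r
# Hypothetical helper: builds one 'cdo merge' command per pair of files.
# Nothing is executed here, so the result can be checked first.
build_merge_commands <- function(u_files, v_files, prefix = "output_") {
  stopifnot(length(u_files) == length(v_files))
  out_files <- paste0(prefix, u_files)
  # paste() is vectorized: the constant 'cdo merge' is recycled
  # across every element of the file vectors.
  paste("cdo merge", u_files, v_files, out_files)
}

u.files <- c("uas_Amon_ACCESS1-3.nc", "uas_Amon_CMCC-CESM.nc")
# Assumption: each v file has the same name with 'vas' instead of 'uas'.
v.files <- sub("^uas", "vas", u.files)

comm <- build_merge_commands(u.files, v.files)
print(comm)

# To actually run the commands (requires cdo to be installed):
# status <- vapply(comm, system, integer(1))
```

Collecting the exit codes with vapply, as in the commented line, is one way to check afterwards which merges failed; lapply(comm, system) as in the answer works as well.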