I've noticed that R
keeps the index from for
loops stored in the global environment, e.g.:
for (ii in 1:5){ }
print(ii)
# [1] 5
Is it common for people to have any need for this index after running the loop?
I never use it, and am forced to remember to add rm(ii)
after every loop I run (first, because I'm anal about keeping my namespace clean and second, for memory, because I sometimes loop over lists of data.tables
--in my code right now, I have 357MB-worth of dummy variables wasting space).
Is there an easy way to get around this annoyance?
Perfect would be a global option to set (a la options(keep_for_index = FALSE)
; something like for(ii in 1:5, keep_index = FALSE)
could be acceptable as well.
I agree with the comments above. Even if you have to use for
loop (using just side effects, not functions' return values) it would be a good idea to structure
your code in several functions and store your data in lists.
However, there is a way to "hide" index and all temporary variables inside the loop - by calling the for
function in a separate environment:
do.call(`for`, alist(i, 1:3, {
# ...
print(i)
# ...
}), envir = new.env())
But ... if you could put your code in a function, the solution is more elegant:
for_each <- function(x, FUN) {
for(i in x) {
FUN(i)
}
}
for_each(1:3, print)
Note that with using "for_each"-like construct you don't even see the index variable.
In order to do what you suggest, R would have to change the scoping rules for for
loops. This will likely never happen because i'm sure there is code out there in packages that rely on it. You may not use the index after the for loop, but given that loops can break()
at any time, the final iteration value isn't always known ahead of time. And having this as a global option again would cause problems with existing code in working packages.
As pointed out, it's for more common to use sapply or lapply loops in R. Something like
for(i in 1:4) {
lm(data[, 1] ~ data[, i])
}
becomes
sapply(1:4, function(i) {
lm(data[, 1] ~ data[, i])
})
You shouldn't be afraid of functions in R. After all, R is a functional language.
It's fine to use for
loops for more control, but you will have to take care of removing the indexing variable with rm()
as you've pointed out. Unless you're using a different indexing variable in each loop, i'm surprised that they are piling up. I'm also surprised that in your case, if they are data.tables, they they are adding additional memory since data.tables don't make deep copies by default as far as i know. The only memory "price" you would pay is a simple pointer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With