I read in a public-use dataset that created dozens of temporary vectors in the process of building a final dataframe. Since this dataframe will be analyzed as part of a larger process, I plan on sourcing the R script that creates the dataframe, but do not want to leave myself or future users with a cluttered global environment.
I know that I can use ls() to list the current objects in my global environment and rm() to remove certain objects, but I'm unsure how to use those two functions in concert to remove all objects except the dataframe created by a certain script.
To clarify, here is a reproducible example:
Script 1, named "script1.R"
setwd("C:/R/project")
set.seed(12345)
var <- letters
for (i in var) {
  assign(i, runif(1))
}
df <- data.frame(x1 = a, x2 = b, x3 = c)
Script 2
source("script1.R")
It would be easy enough to remove all vectors from the sourced script by some combination of rm() and ls() with pattern = letters or something like that, but what I want to do is create a general function that removes ALL vectors created by a certain script and retains only the dataframe (in this example, df).
(NOTE: There are similar questions as this here and here, but I feel mine is different in that it is more specific to sourcing and cleaning in the context of a multi-script project).
Update: While looking around, I found a nice workaround in the following question:
How can I neatly clean my R workspace while preserving certain objects?
Specifically, user @Fojtasek suggested:
I would approach this by making a separate environment in which to store all the junk variables, making your data frame using with(), then copying the ones you want to keep into the main environment. This has the advantage of being tidy, but also keeping all your objects around in case you want to look at them again.
So I could just append the source code that creates the dataframe as follows...
temp <- new.env()
with(temp, {
  var <- letters
  for (i in var) {
    assign(i, runif(1))
  }
  df <- data.frame(x1 = a, x2 = b, x3 = c)
})
... and then just extract the desired dataframe (df) to my global environment, but I'm curious if there are other elegant solutions, or if I'm thinking about this incorrectly.
Thanks.
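One way the separate-environment idea could be generalized is to source the script into a scratch environment and copy out only the objects you name. The following is a minimal sketch, not a definitive implementation: the helper name source_keep and its arguments are hypothetical, and the script is written to a temporary file (without the setwd() call) so the example is self-contained. It relies on the local argument of source(), which accepts an environment.

```r
# Hypothetical helper: run a script in a throwaway environment and
# return only the objects named in `keep` as a named list.
source_keep <- function(file, keep) {
  scratch <- new.env()            # temporary vectors end up here
  source(file, local = scratch)   # evaluate the script inside scratch
  mget(keep, envir = scratch)     # fetch only the requested objects
}

# Stand-in for script1.R, written to a temp file for this example.
script <- tempfile(fileext = ".R")
writeLines(c(
  "for (i in letters) assign(i, runif(1))",
  "df <- data.frame(x1 = a, x2 = b, x3 = c)"
), script)

# Only df reaches the calling environment; the 26 vectors stay in
# the scratch environment, which is discarded with the function call.
df <- source_keep(script, "df")$df
```

Because the scratch environment is created inside the function, nothing needs to be removed afterwards; the temporary objects simply go out of scope.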
remove and rm can be used to remove objects. These can be specified successively as character strings, or in the character vector list, or through a combination of both. All objects thus specified will be removed. If envir is NULL then the currently active environment is searched first.
The rm() function in R is used to delete objects from memory. It can be combined with ls() to delete all objects. The remove() function is equivalent to rm().
Actually, there are two different functions that can be used for clearing specific data objects from the R workspace: rm() and remove(). However, these two functions are exactly the same, so you can use whichever you prefer. Either one clears a data object such as x from the R workspace.
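The calling styles described above can be sketched as follows. The object names (x, y, z, w, p, q) and the scratch environment are made up for illustration; the "remove everything" step is run inside that scratch environment, via the envir argument, so that trying the example does not wipe your own workspace.

```r
x <- 1; y <- 2; z <- 3; w <- 4

rm(x)               # bare name
rm("y")             # character string
rm(list = c("z"))   # character vector passed through `list`
remove(w)           # remove() behaves the same as rm()

# rm() combined with ls() clears every object in an environment.
scratch <- new.env()
assign("p", 1, envir = scratch)
assign("q", 2, envir = scratch)
rm(list = ls(scratch), envir = scratch)
length(ls(scratch))  # 0
```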
In RStudio, ensure the Environment tab is in Grid (not List ) mode. Tick the object(s) you want to remove from the environment. Click the broom icon.
As an alternative approach (similar to @Ken's suggestion from the comments), the following code allows you to delete all objects created after a certain point, except one (or more) that you specify:
freeze <- ls() # all objects created after here will be deleted
var <- letters
for (i in var) {
  assign(i, runif(1))
}
df <- data.frame(x1 = a, x2 = b, x3 = c)
rm(list = setdiff(ls(), c(freeze, "df"))) # delete everything created after freeze, except df
The workhorse here is setdiff(), which returns the items that appear in the first vector but not the second. In this case, that is every object created after freeze except df. As an added bonus, freeze itself is deleted as well, since it did not exist yet when ls() was captured.
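To see what setdiff() hands to rm() in that last line, here is a small illustration on plain character vectors. The names in before and after are invented stand-ins for what ls() might return before and after the script body runs.

```r
# Stand-in for ls() at the freeze point (freeze itself does not exist yet).
before <- c("old1", "old2")

# Stand-in for ls() after the script body has run.
after <- c("a", "b", "df", "freeze", "old1", "old2")

# Everything in `after` that is neither a pre-existing object nor "df".
to_remove <- setdiff(after, c(before, "df"))
print(to_remove)  # "a" "b" "freeze"
```

Note that "freeze" lands in the removal set because it is absent from the snapshot, which is why the original code cleans up after itself.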