I'm trying to figure out how saving works in RStudio.
When I create a new project, an .Rproj file is created. Whenever I work in RStudio, Save and Save As are greyed out in the File menu. The only way I know to create an .Rproj file is by starting a new project.
In the Environment section, I can see a floppy disk Save icon. When I click it, it creates an .RData file. Whenever I want to save, I click that icon and overwrite the file.
Can someone please explain what the best practices are for saving when using RStudio, and the key distinctions between .Rproj and .RData files?
You should probably read Using Projects - RStudio Support. R Projects are super useful, but they are not meant for saving data from your R environment. They are used exclusively by the RStudio code editor. One of the nicest things they do is automatically set your working directory to the project directory when you open one. They also remember which files you had open in RStudio, along with other editing-related preferences. Definitely use R Projects!
.RData is a file of R objects. You can create an R data file from within R (not just RStudio) using the save() command and later load the objects back into your workspace with load(). You can save all the objects in your workspace (save.image() does this automatically - it's a wrapper around save()) or only specific objects. See ?save for details. (For single objects, .rds files created with saveRDS() are preferred.)
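As a minimal sketch of the difference between the two approaches (the object names and file names here are made up for illustration, and everything is written to a temporary directory):

```r
# Hypothetical objects, used only for illustration
scores <- c(90, 85, 77)
labels <- c("a", "b", "c")

# Work in a temporary directory so nothing in your project is overwritten
rdata_path <- file.path(tempdir(), "example.RData")
rds_path <- file.path(tempdir(), "scores.rds")

# save() stores one or more named objects; load() restores them under the same names
save(scores, labels, file = rdata_path)
rm(scores, labels)
load(rdata_path)

# saveRDS() stores a single object without its name; readRDS() returns it,
# so you choose the name when reading it back
saveRDS(scores, file = rds_path)
scores_copy <- readRDS(rds_path)
```

Because load() restores objects under their original names, it can silently overwrite something already in your workspace, which is one reason .rds files are often the safer choice for a single object.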
For many years (since long before RStudio came to be) the default RGui has given the option to save all the objects in your workspace to an .RData file on exit. RStudio also gives this option (unless you turn it off).
The diskette "save" icon at the top of the editor pane in RStudio does not save R objects; it saves only the code you have written in your scripts. The "Environment" tab also has a diskette save icon, and that one does save R objects.
This gets into opinions of style; there is no definitive answer. My personal preference is to never do a blanket save of all objects in my workspace, because it enables the bad habit of not keeping the code needed to create those objects. I save all my scripts, and if a particular object takes a long time to create, I script the saving of it:
saveRDS(object = final_model, file = "final_model.rds")
I treat a model or a cleaned data set much like a nice plot in code - keep the code to make it in case you want to tweak it, but save the output to a file so you don't have to run the code to recreate it every time you want to look at it.
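One way to script this, sketched under the assumption that the slow step can be skipped whenever the file already exists (the lm() call on the built-in mtcars data is only a stand-in for an expensive computation, and the temporary path is hypothetical):

```r
# Hypothetical cache file; a temporary directory keeps the example self-contained
model_path <- file.path(tempdir(), "final_model.rds")

if (file.exists(model_path)) {
  # Reuse the stored result instead of refitting
  final_model <- readRDS(model_path)
} else {
  # Stand-in for a slow computation
  final_model <- lm(mpg ~ wt, data = mtcars)
  saveRDS(object = final_model, file = model_path)
}
```

If you change the code that builds the object, delete the .rds file so that it gets rebuilt; this sketch does not detect stale results on its own.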
For larger projects I try to keep the scope of an individual script small and I often number scripts (in the order I'd want to run them to start from the beginning) as suggested by answers to Workflow for statistical analysis and report writing. Most scripts begin by reading in objects they depend on and end by saving their outputs.
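A rough illustration of that pattern, with two hypothetical scripts (01_clean.R and 02_model.R) shown back to back in one block; the directory, file names, and the use of mtcars are all assumptions made for the example:

```r
# Hypothetical project directory, created under tempdir() for illustration
project_dir <- file.path(tempdir(), "demo_project")
dir.create(project_dir, showWarnings = FALSE)

# --- 01_clean.R (hypothetical): build the cleaned data and save it ---
cleaned <- subset(mtcars, !is.na(mpg), select = c(mpg, wt))
saveRDS(cleaned, file.path(project_dir, "cleaned.rds"))

# --- 02_model.R (hypothetical): read the previous output, save its own ---
cleaned <- readRDS(file.path(project_dir, "cleaned.rds"))
fit <- lm(mpg ~ wt, data = cleaned)
saveRDS(fit, file.path(project_dir, "fit.rds"))
```

Each script depends only on files written by earlier scripts, so any one of them can be rerun on its own in a fresh R session.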
The function save() writes a representation of your R objects to a specified file. Later, the objects can be read back from that file with load() or attach(), or in some cases with data(), such as for R's built-in datasets.
It lets you save the objects and functions that you have created in an .RData file. It is important to include the .RData extension when giving the file path, because save() does not add one for you. The help file (?save) provides further details.
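A short sketch of reading such a file back with load() and attach(); the object names, the file name, and the "demo_file" label are hypothetical, and the file is written to a temporary directory:

```r
x <- 1:3
square <- function(v) v^2

# save() uses the file name exactly as given, so the extension is typed out by hand
path <- file.path(tempdir(), "workspace_demo.RData")
save(x, square, file = path)

# load() invisibly returns the names of the objects it restored;
# loading into a fresh environment avoids overwriting anything in the workspace
restored <- new.env()
restored_names <- load(path, envir = restored)

# attach() places the file's objects on the search path instead of copying
# them into the global environment
rm(x, square)
attach(path, name = "demo_file")
result <- square(x)
detach("demo_file", character.only = TRUE)
```

Loading into a separate environment is a handy way to inspect an unfamiliar .RData file before letting it touch your workspace.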
RStudio projects let you divide your work into multiple contexts, each with its own working directory, workspace, history, and source documents. The Create Project command lets you create a project in a new or existing directory. .RData files are written to the project directory by default. Projects are a useful tool for workspace management. You can find a full description of their features at https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects.