A very simple question:
I am writing and running my R scripts using a text editor to make them reproducible, as has been suggested by several members of SO.
This approach is working very well for me, but I sometimes have to perform expensive operations (e.g. read.csv
or reshape
on 2M-row databases) that I'd better cache in the R environment rather than re-run every time I run the script (which is usually many times as I progress and test the new lines of code).
Is there a way to cache what a script does up to a certain point so every time I am only running the incremental lines of code (just as I would do by running R interactively)?
Thanks.
## load the file from disk only if it
## hasn't already been read into a variable
if(!(exists("mytable")){
mytable=read.csv(...)
}
Edit: fixed typo - thanks Dirk.
Some simple ways are doable with some combinations of
exists("foo")
to test if a variable exists, else re-load or re-computefile.info("foo.Rd")$ctime
which you can compare to Sys.time()
and see if it is newer than a given amount of time you can load, else recompute.There are also caching packages on CRAN that may be useful.
After you do something you discover to be costly, save the results of that costly step in an R data file.
For example, if you loaded a csv into a data frame called myVeryLargeDataFrame
and then created summary stats from that data frame into a df called VLDFSummary
then you could do this:
save(c(myVeryLargeDataFrame, VLDFSummary),
file="~/myProject/cachedData/VLDF.RData",
compress="bzip2")
The compress option there is optional and to be used if you want to compress the file being written to disk. See ?save
for more details.
After you save the RData file you can comment out the slow data loading and summary steps as well as the save step and simply load the data like this:
load("~/myProject/cachedData/VLDF.RData")
This answer is not editor dependent. It works the same for Emacs, TextMate, etc. You can save to any location on your computer. I recommend keeping the slow code in your R script file, however, so you can always know where your RData file came from and be able to recreate it from the source data if needed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With