I'm starting to learn about doing data analysis in Python.
In R, you can load data into memory, then save variables into a .rdata
file.
I'm trying to create an analysis "project", so I can load the data, store the scripts, then save the output so I can recall it should I need to.
Is there an equivalent function in Python?
Thanks
json
pickle
What you're looking for is binary serialization. The most notable functionality for this in Python is pickle
. If you have some standard scientific data structures, you could look at HDF5 instead. JSON works for a lot of objects as well, but it is not binary serialization - it is text-based.
If you expand your options, there are a lot of other serialization options, too. Such as Google's Protocol Buffers (the developer of Rprotobuf
is the top-ranked answerer for the r tag on SO), Avro, Thrift, and more.
Although there are generic serialization options, such as pickle
and .Rdat
, careful consideration of your usage will be helpful in making I/O fast and appropriate to your needs, especially if you need random access, portability, parallel access, tool re-use, etc. For instance, I now tend to avoid .Rdat
for large objects.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With