
Similar .rdata functionality in Python?

Tags: python, r

I'm starting to learn about doing data analysis in Python.

In R, you can load data into memory, then save variables into a .rdata file.

I'm trying to create an analysis "project", so I can load the data, store the scripts, then save the output so I can recall it should I need to.

Is there an equivalent function in Python?

Thanks

mikebmassey asked Jan 07 '12 20:01



2 Answers

json
pickle

Ignacio Vazquez-Abrams answered Sep 18 '22 21:09



What you're looking for is binary serialization. The most notable functionality for this in Python is pickle. If you have some standard scientific data structures, you could look at HDF5 instead. JSON works for a lot of objects as well, but it is not binary serialization - it is text-based.
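As a rough sketch of how pickle can stand in for saving several variables to one .rdata-style file: the helper names `save_workspace` and `load_workspace`, the file name, and the sample data below are all made up for illustration, and only the standard-library `pickle` calls are real.

```python
import pickle
import tempfile
from pathlib import Path


def save_workspace(path, **variables):
    """Write the named variables to a single pickle file (hypothetical helper)."""
    with open(path, "wb") as fh:
        pickle.dump(variables, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_workspace(path):
    """Return the saved variables as a dict mapping name -> object."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Made-up analysis results, purely for illustration.
scores = [0.91, 0.87, 0.95]
params = {"alpha": 0.05, "method": "ols"}

# A temporary directory keeps the example self-contained.
workspace_file = Path(tempfile.mkdtemp()) / "analysis.pkl"
save_workspace(workspace_file, scores=scores, params=params)

# Later, or in another session: read everything back by name.
restored = load_workspace(workspace_file)
print(restored["params"]["method"])
```

Unlike R's load(), this returns a dict rather than injecting names into the current namespace, which tends to be easier to reason about. Only unpickle files you trust, since loading a pickle can execute arbitrary code.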

If you look further afield, there are many other serialization formats as well, such as Google's Protocol Buffers (the developer of RProtoBuf is the top-ranked answerer for the r tag on SO), Avro, and Thrift.

Generic serialization options such as pickle and .RData exist, but it pays to think about how you will actually use the data. That choice determines whether I/O is fast and fits your needs, especially if you need random access, portability, parallel access, or tool re-use. For instance, I now tend to avoid .RData for large objects.
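To illustrate the random-access point: the answer does not name a specific tool for it, so the sketch below swaps in Python's standard-library `shelve` module as one possible option. The store path, keys, and values are hypothetical. Each object is stored under its own key, so one result can be read back without loading the rest.

```python
import shelve
import tempfile
from pathlib import Path

# Hypothetical store location inside a temporary directory.
store_path = str(Path(tempfile.mkdtemp()) / "results")

# Save each result under its own key (values are pickled behind the scenes).
with shelve.open(store_path) as store:
    store["model_a"] = {"rmse": 1.23, "n": 500}
    store["model_b"] = {"rmse": 0.98, "n": 500}

# Reopen later and fetch a single entry by key.
with shelve.open(store_path) as store:
    model_b = store["model_b"]
    saved_keys = sorted(store.keys())

print(saved_keys, model_b)
```

A shelf is still pickle underneath, so it shares pickle's portability limits and trust caveats; for large numeric arrays, HDF5 is likely the better fit.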

Iterator answered Sep 18 '22 21:09
