The question may seem a little basic, but I wasn't able to find anything that I understood on the internet. How do I store something that I pickled with dill?
I have come this far for saving my construct (pandas DataFrame, which also contains custom classes):
import dill
dill_file = open("data/2017-02-10_21:43_resultstatsDF", "wb")
dill_file.write(dill.dumps(resultstatsDF))
dill_file.close()
and for reading
dill_file = open("data/2017-02-10_21:43_resultstatsDF", "rb")
resultstatsDF_out = dill.load(dill_file.read())
dill_file.close()
but when reading I get the error
TypeError: file must have 'read' and 'readline' attributes
How do I do this?
EDIT for future readers: After having used this approach (pickling my DataFrame) for a while, I now refrain from doing so. As it turns out, different program versions (including those of the objects stored in the dill file) can make it impossible to recover the pickled file. Now I make sure that everything I want to save can be expressed as a string (as efficiently as possible), ideally a human-readable one. I store my data as CSV, and objects in CSV cells can be represented in JSON format. That way I make sure that my files will be readable in the months and years to come. Even if the code changes, I am able to rewrite encoders by parsing the strings, and I am able to understand the CSV by inspecting it manually.
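The CSV-with-JSON-cells idea from the edit above could look roughly like this. It is only a minimal sketch using the standard library: the column names (run, score, params) and the helper names rows_to_csv / csv_to_rows are made up for illustration and are not from the original post.

```python
import csv
import io
import json

# Hypothetical rows: the "params" column holds a nested object.
rows = [
    {"run": "a", "score": 0.91, "params": {"depth": 3, "tags": ["x", "y"]}},
    {"run": "b", "score": 0.87, "params": {"depth": 5, "tags": []}},
]

def rows_to_csv(rows):
    """Serialize rows to CSV text, JSON-encoding the nested 'params' cell."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["run", "score", "params"])
    writer.writeheader()
    for row in rows:
        # Copy the row and replace the nested object with its JSON string.
        encoded = dict(row, params=json.dumps(row["params"], sort_keys=True))
        writer.writerow(encoded)
    return buffer.getvalue()

def csv_to_rows(text):
    """Parse the CSV text back, decoding the JSON cell and the float column."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [
        {
            "run": r["run"],
            "score": float(r["score"]),
            "params": json.loads(r["params"]),
        }
        for r in reader
    ]

csv_text = rows_to_csv(rows)
restored = csv_to_rows(csv_text)
```

The resulting text stays readable in any editor or spreadsheet, and the decoding step is explicit, so it can be rewritten if the classes behind the JSON cells change.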
To save a pickle, use pickle.dump. A common convention is to name pickle files *.pickle, but you can name them whatever you want.
To load a pickle, use the pickle.load() function. Its main argument is the file object that you get by opening the file in read-binary (rb) mode.
To use pickle, start by importing it in Python. To pickle an object such as a dictionary, first choose the name of the file you will write it to, for example dogs (the file does not need an extension). Then open the file for writing in binary mode (wb) with the open() function.
Pickling is also known as "serialization", "marshalling", or "flattening". To serialize (pickle) an object hierarchy, call the dump() function; to de-serialize a data stream, call the load() function. You can pickle most built-in objects, such as integers, strings, tuples, lists, and dictionaries.
Just give it the file object itself, without calling read():
resultstatsDF_out = dill.load(dill_file)
You can also dump to a file directly with dill, like this:
with open("data/2017-02-10_21:43_resultstatsDF", "wb") as dill_file:
    dill.dump(resultstatsDF, dill_file)
So:
dill.dump(obj, open_file)
writes to a file directly. Whereas:
dill.dumps(obj)
serializes obj to a bytes object,
which you can then write to a file yourself.
Likewise:
dill.load(open_file)
reads from a file, and:
dill.loads(serialized_obj)
constructs an object from a serialized object (bytes), which you could have read from a file.
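To make the difference between the four functions concrete, here is a small round-trip sketch. It assumes dill is installed and otherwise falls back to the standard-library pickle, which has the same dump/dumps/load/loads calls; the data dict and the temporary file name are hypothetical stand-ins for the DataFrame in the question.

```python
import os
import tempfile

try:
    import dill as serializer  # third-party; mirrors pickle's dump/load calls
except ImportError:
    import pickle as serializer  # standard-library fallback for this sketch

# Hypothetical stand-in for the object you want to store.
data = {"name": "resultstats", "values": [1, 2, 3]}

# In-memory round trip: dumps() returns bytes, loads() accepts bytes.
blob = serializer.dumps(data)
from_bytes = serializer.loads(blob)

# File round trip: dump() and load() take an open binary file object,
# not the bytes returned by file.read().
path = os.path.join(tempfile.mkdtemp(), "example.pkl")
with open(path, "wb") as fobj:
    serializer.dump(data, fobj)
with open(path, "rb") as fobj:
    from_file = serializer.load(fobj)
```

Passing the result of fobj.read() to load() is what triggers the TypeError in the question; those bytes would have to go to loads() instead.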
It is recommended to open a file using the with statement.
Here:
with open(path) as fobj:
    # do something with fobj
has the same effect as:
fobj = open(path)
try:
    # do something with fobj
finally:
    fobj.close()
The file will be closed as soon as you leave the indented block of the with statement, even in the case of an exception.
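If you want to convince yourself of this, a quick check along the following lines should do. It is just a sketch: the scratch file name and the simulated ValueError are made up for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Normal case: the file is closed once the with block ends.
with open(path, "w") as fobj:
    fobj.write("hello")
closed_after_block = fobj.closed

# Error case: the file is still closed, even though the block raised.
try:
    with open(path) as fobj2:
        raise ValueError("simulated failure")
except ValueError:
    pass
closed_after_error = fobj2.closed
```

Both flags end up True, because the file object's closed attribute is set when the with block exits, whether normally or through an exception.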