What is pickle doing?

I have used Python for years. I have used pickle extensively. I cannot figure out what this is doing:

import codecs
import pickle

with codecs.open("huge_picklefile.pc", "rb") as f:
    data = pickle.load(f)
    print(len(data))
    data = pickle.load(f)
    print(len(data))
    data = pickle.load(f)
    print(len(data))

This returns to me:

335
59
12

I am beyond confused. I am used to pickle loading the massive file into memory. The object itself is a massive array of arrays (I assume). Could it be composed of multiple pickle objects? Unfortunately, I didn't create the pickle object and I don't have access to whoever did.

I cannot figure out why pickle is splitting up my file into chunks, which isn't the default, and I am not telling it to. What does reloading the same file do? I honestly never tried or even came across a use case until now.

I spent a good 5 hours trying to figure out how to even ask this question on Google. Unsurprisingly, trying "multiple pickle loads on the same document" doesn't yield anything too useful. The Python 3.7 pickle docs do not describe this behavior. I can't figure out how repeatedly loading a pickle document doesn't (a) crash or (b) load the entire thing into memory and then just reference itself. In my 15 years of using Python I have never run into this problem... so I am taking a leap of faith that this is just weird and we should probably just use a database instead.

asked Nov 28 '18 23:11 by Ben Holland


1 Answer

This file is not quite a single pickle. Someone has dumped multiple pickles into the same file, so its contents are a concatenation of several pickles. Each call to pickle.load(f) reads from the current file position until it reaches the end of one pickle (the STOP opcode), returns that object, and leaves the file position just past it, so the next pickle.load call loads the next pickle.
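To see the file position advancing, here is a minimal sketch using an in-memory stream instead of a real file. The two objects and the variable names are made up for illustration; only pickle.dumps, pickle.load and io.BytesIO come from the standard library.

```python
import io
import pickle

# Two independent pickles, concatenated into one in-memory "file".
first = pickle.dumps([1, 2, 3])
second = pickle.dumps({"a": 1})
stream = io.BytesIO(first + second)

# Each load consumes exactly one pickle and stops right after it.
obj1 = pickle.load(stream)
pos_after_first = stream.tell()
obj2 = pickle.load(stream)
pos_after_second = stream.tell()

print(obj1, pos_after_first == len(first))
print(obj2, pos_after_second == len(first) + len(second))

# Nothing is left in the stream, so one more load raises EOFError.
try:
    pickle.load(stream)
    hit_eof = False
except EOFError:
    hit_eof = True
print("EOFError after last pickle:", hit_eof)
```

So nothing is being "reloaded": every call picks up where the previous one stopped.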

You can create such a file yourself by calling pickle.dump repeatedly:

import pickle

with open('demofile', 'wb') as f:
    pickle.dump([1, 2, 3], f)
    pickle.dump([10, 20], f)
    pickle.dump([0, 0, 0], f)
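If you don't know how many pickles the file holds, one common approach is to keep calling pickle.load until it raises EOFError. The sketch below assumes that pattern; load_all is a hypothetical helper name, and the demo file is written to a temporary directory so the example runs on its own.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Yield every object pickled back to back in the file at `path`."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more pickles left in the file.
                return


# Build a small demo file containing three concatenated pickles.
demo_path = os.path.join(tempfile.mkdtemp(), "demofile")
with open(demo_path, "wb") as f:
    pickle.dump([1, 2, 3], f)
    pickle.dump([10, 20], f)
    pickle.dump([0, 0, 0], f)

lengths = [len(obj) for obj in load_all(demo_path)]
print(lengths)  # [3, 2, 3]
```

Because load_all is a generator, only one object is held in memory at a time unless you collect the results yourself, which may matter for a file as large as the one in the question.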
answered Sep 29 '22 00:09 by user2357112 supports Monica