This is a question from a lazy man.
I have a pandas DataFrame with 4 million rows and would like to save it as smaller chunks of pickle files.
Why smaller chunks? To save and load them more quickly.
My questions are: 1) Is there a better way (a built-in function) to save the DataFrame in smaller pieces than manually chunking it with np.array_split?
2) Is there a graceful way to glue the chunks back together when I read them, other than concatenating them manually?
Please feel free to suggest any other format better suited to this job than pickle.
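If the goal is to save and load quickly, you should look into using a SQL database rather than pickling. If your computer chokes when you ask it to write 4 million rows at once, you can specify a chunk size (for example, the chunksize argument of DataFrame.to_sql) so that the rows are written in batches.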
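As a rough illustration of writing in batches, here is a minimal sketch using pandas' DataFrame.to_sql with its chunksize argument. The in-memory SQLite database, the table name "measurements", the column names, and the small row count are all assumptions made for the example, not details from the question.

```python
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical stand-in for the 4-million-row frame (kept small here).
df = pd.DataFrame({"id": np.arange(10_000), "value": np.random.rand(10_000)})

# In-memory SQLite database; pass a file path instead to persist the data.
conn = sqlite3.connect(":memory:")

# chunksize limits how many rows are written per batch, so the whole
# frame does not have to be sent to the database in one go.
df.to_sql("measurements", conn, index=False, if_exists="replace", chunksize=1_000)

# Confirm that every row arrived.
row_count = conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
print(row_count)
```

Whether this beats pickle for speed depends on the data and the database, so it is worth timing both on a sample before committing to one.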
From there you can query slices with standard SQL.
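A sketch of what querying slices might look like, again assuming an in-memory SQLite database and a hypothetical "measurements" table with an "id" column. It also shows one way to read the table back in chunks and glue them together with pd.concat, which addresses the second part of the question.

```python
import sqlite3

import numpy as np
import pandas as pd

# Set up a small hypothetical table so the example is self-contained.
conn = sqlite3.connect(":memory:")
source = pd.DataFrame({"id": np.arange(10_000), "value": np.arange(10_000) * 0.5})
source.to_sql("measurements", conn, index=False, chunksize=1_000)

# Load a single slice with an ordinary WHERE clause.
slice_df = pd.read_sql(
    "SELECT * FROM measurements WHERE id >= ? AND id < ?",
    conn,
    params=(2_000, 3_000),
)

# With chunksize set, read_sql returns an iterator of DataFrames;
# pd.concat glues them back into one frame.
chunks = pd.read_sql("SELECT * FROM measurements ORDER BY id", conn, chunksize=2_500)
full_df = pd.concat(list(chunks), ignore_index=True)

print(len(slice_df), len(full_df))
```

Filtering in the query means only the rows you need are loaded, which is the main advantage over reading whole pickle files and slicing afterwards.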