What's the fastest way to pickle a pandas DataFrame?

Tags: python, pandas, pickle

Which is faster: the pandas built-in method or pickle.dump?

The standard pickle method looks like this:

with open('test_pickle.p', 'wb') as f:
    pickle.dump(my_dataframe, f)

The Pandas built-in method looks like this:

my_dataframe.to_pickle('test_pickle.p')
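For reference, here is a minimal sketch that writes the same DataFrame both ways and reads each file back. The sample data and the file names std_pickle.p and pandas_pickle.p are made up for illustration, and it assumes the files are uncompressed (pandas infers compression from the extension, and .p is not a compressed one).

```python
import os
import pickle
import tempfile

import pandas as pd

# Hypothetical example data; any DataFrame would do.
my_dataframe = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

with tempfile.TemporaryDirectory() as tmp:
    std_path = os.path.join(tmp, "std_pickle.p")
    pd_path = os.path.join(tmp, "pandas_pickle.p")

    # Standard library: open the file ourselves and close it via `with`.
    with open(std_path, "wb") as f:
        pickle.dump(my_dataframe, f)

    # pandas built-in: pass a path and let pandas handle the file.
    my_dataframe.to_pickle(pd_path)

    # Cross-load: read the pandas-written file with pickle.load
    # and the pickle-written file with pd.read_pickle.
    with open(pd_path, "rb") as f:
        from_std_loader = pickle.load(f)
    from_pd_loader = pd.read_pickle(std_path)

frames_match = (
    my_dataframe.equals(from_std_loader)
    and my_dataframe.equals(from_pd_loader)
)
print(frames_match)
```

In this sketch the two approaches are interchangeable on disk, so the choice comes down to speed and convenience.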

asked Feb 26 '15 23:02

tegan

1 Answer

Thanks to @qwwqwwq I discovered that pandas has a built-in to_pickle method for dataframes. I did a quick time test:

In [1]: %timeit pickle.dump(df, open('test_pickle.p', 'wb'))
10 loops, best of 3: 91.8 ms per loop

In [2]: %timeit df.to_pickle('testpickle.p')
10 loops, best of 3: 88 ms per loop

So the built-in is only narrowly faster, about 4% in this test. To me this is useful because it means it's probably not worth refactoring existing code to use the built-in. Hope this helps someone!
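If you want to repeat the comparison outside IPython, something like the following sketch should work. The test frame is an assumption (the timings above don't say what df contained), and the protocol is pinned to pickle.HIGHEST_PROTOCOL on both sides so that differing defaults between pickle.dump and to_pickle don't skew the result.

```python
import os
import pickle
import tempfile
import timeit

import numpy as np
import pandas as pd

# Hypothetical test frame; the original timings do not say what `df` held.
df = pd.DataFrame(np.random.rand(20_000, 10))


def dump_with_pickle(path):
    # Standard library route, closing the file handle explicitly.
    with open(path, "wb") as f:
        pickle.dump(df, f, protocol=pickle.HIGHEST_PROTOCOL)


def dump_with_pandas(path):
    # pandas built-in route, with the same protocol for a fair comparison.
    df.to_pickle(path, protocol=pickle.HIGHEST_PROTOCOL)


timings = {}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test_pickle.p")
    for name, func in [("pickle.dump", dump_with_pickle),
                       ("df.to_pickle", dump_with_pandas)]:
        # Best of 3 repeats of 10 calls each, reported per call.
        best = min(timeit.repeat(lambda: func(path), number=10, repeat=3)) / 10
        timings[name] = best
        print(f"{name}: {best * 1e3:.1f} ms per call")
```

Expect the numbers to vary with the machine, the disk, and the shape and dtypes of the frame, so treat a gap of a few percent as noise unless it holds up across repeated runs.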

answered Oct 10 '22 12:10

tegan
