I'm trying to overwrite a pandas DataFrame in an HDF5 file. Each time I do this, the file size grows, even though the stored frame content is the same. If I use mode='w' I lose all the other records. Is this a bug, or am I missing something?
import pandas

df = pandas.read_csv('1.csv')
for i in range(100):
    store = pandas.HDFStore('tmp.h5')
    store.put('TMP', df)
    store.close()
The tmp.h5 file grows in size on every iteration.
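Here is a minimal, self-contained variant of that reproduction (a synthetic DataFrame stands in for '1.csv'; the file name is kept from the question) that prints the file size on each pass:

import os

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(1000, 4))

for i in range(5):
    store = pd.HDFStore('tmp.h5')  # default mode='a', so the same file is reused
    store.put('TMP', df)           # replaces the 'TMP' node; the old node's space is not reclaimed
    store.close()
    print(i, os.path.getsize('tmp.h5'))  # the reported size keeps growing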
Averaged I/O times for each data format show an interesting pattern: HDF loads even more slowly than CSV, while the other binary formats perform noticeably better. The two most impressive are Feather and Parquet.
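A rough way to run such a comparison yourself is to time a write/read round trip per format (a sketch, assuming pyarrow is installed for the Feather and Parquet writers and PyTables for HDF; results depend heavily on data, dtypes, and hardware):

import time

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(1_000_000, 10),
                  columns=[f'c{i}' for i in range(10)])

# (write function, read function, file name) per format
formats = {
    'csv': (df.to_csv, pd.read_csv, 'bench.csv'),
    'hdf': (lambda p: df.to_hdf(p, key='df', mode='w'),
            lambda p: pd.read_hdf(p, key='df'), 'bench.h5'),
    'feather': (df.to_feather, pd.read_feather, 'bench.feather'),
    'parquet': (df.to_parquet, pd.read_parquet, 'bench.parquet'),
}

for name, (write, read, path) in formats.items():
    t0 = time.perf_counter()
    write(path)
    t1 = time.perf_counter()
    read(path)
    t2 = time.perf_counter()
    print(f'{name}: write {t1 - t0:.2f}s, read {t2 - t1:.2f}s')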
Pandas uses PyTables for reading and writing HDF5 files. With the "fixed" format, PyTables serializes object-dtype data with pickle.
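The storage format can be selected per key; a small sketch (file and key names are illustrative):

import pandas as pd

df = pd.DataFrame({'num': [1.0, 2.0], 'obj': ['a', 'b']})

store = pd.HDFStore('formats.h5', mode='w')
# 'fixed' (the default for put): fast, but not appendable or queryable;
# the object-dtype column is pickled.
store.put('df_fixed', df, format='fixed')
# 'table': slower to write, but appendable and supports on-disk queries.
store.put('df_table', df, format='table')
store.close()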
The short answer is that a pandas DataFrame has no set limit on the number of cells; the practical size limit is the memory available on your machine, which is large enough that you will likely never have to worry about it.
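You can check how much memory a given DataFrame actually occupies:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(1000, 5))
# Total in-memory footprint in bytes; deep=True also counts
# the contents of object-dtype columns.
print(df.memory_usage(deep=True).sum())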
This is probably due to your chunk layout - the smaller the chunks, the more bloated your HDF5 file becomes. Try to find a balance between chunk sizes small enough to serve your access pattern and the size overhead that each chunk introduces in the HDF5 file.
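To illustrate the trade-off with PyTables directly (file and array names are made up for the example):

import numpy as np
import tables

data = np.random.rand(1_000_000)

f = tables.open_file('chunked.h5', mode='w')
# Many tiny chunks inflate per-chunk metadata and index overhead...
f.create_carray('/', 'small_chunks', obj=data, chunkshape=(256,))
# ...while larger chunks keep the file compact, at the cost of
# reading more data than needed for small slices.
f.create_carray('/', 'large_chunks', obj=data, chunkshape=(65536,))
f.close()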
Read the big warning at the bottom of that section of the pandas documentation: HDF5 does not automatically reclaim the space freed by deleted or replaced nodes.
This is how HDF5 works. When a node is overwritten or deleted, the space it occupied is marked free inside the file, but the file itself never shrinks; repeatedly replacing a key therefore keeps increasing the file size. To reclaim the space, repack the file into a fresh copy.
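Repacking can be done from Python with PyTables (a sketch; the output file name is up to you), or from the command line with PyTables' ptrepack utility, e.g. ptrepack tmp.h5 tmp_repacked.h5:

import tables

# Copy every live node into a fresh file; the copy contains none of the
# dead space left behind by overwritten keys.
tables.copy_file('tmp.h5', 'tmp_repacked.h5', overwrite=True)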