Pandas dataframe in mixed mode can't serialize to hdf5?

In Pandas it seems I can't store a dataframe of mixed types:

from pandas import DataFrame, HDFStore

store = HDFStore('play.h5')
df = DataFrame([{'a': 1, 'b': 'hello'}, {'a': 5, 'b': 'world'}])
store.put('df', df, table=True, compression='zlib')

This raises an exception: Cannot currently store mixed-type DataFrame objects in Table format

Is this due to some inherent limitation of Pandas or just a future nice-to-have? It seems that HDFStore would not be very useful with this limitation, as many dataframes will be mixed-type.

David van Coevorden asked Dec 02 '25 06:12


1 Answer

The Table format stores all of the data in record form, i.e. all of the values in a row are stored in a single column, so they have to share one type. An alternate table format is possible (one column per DataFrame column), but I haven't implemented that yet. Basically, the table format is designed to support queries.

A mixed-type DataFrame can be stored if you pass table=False, though. I would welcome more work on these features.
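
A minimal sketch of the non-table route, assuming a more recent pandas release in which table=False is spelled format='fixed' (and table=True is format='table'), and assuming PyTables is installed. The roundtrip_fixed helper and the temporary file location are illustrative names, not part of pandas.

```python
import os
import tempfile

import pandas as pd


def roundtrip_fixed(df, path):
    """Write df to an HDF5 file in fixed (non-table) format and read it back."""
    # Assumption: format="fixed" is the newer spelling of table=False.
    with pd.HDFStore(path, mode="w") as store:
        store.put("df", df, format="fixed")
        return store.get("df")


# Same mixed-type frame as in the question: one integer column, one string column.
df = pd.DataFrame([{"a": 1, "b": "hello"}, {"a": 5, "b": "world"}])

with tempfile.TemporaryDirectory() as tmp:
    restored = roundtrip_fixed(df, os.path.join(tmp, "play.h5"))

print(restored)
```

The trade-off is that a frame stored this way is read back whole; it cannot be queried on disk the way the table format allows.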

Wes McKinney answered Dec 03 '25 21:12


