Is there a pyTables method similar to the following:
    with pd.get_store(my_store) as store:
        keys = store.keys()
        rem_key = min(sorted(keys))
        store.remove(rem_key)
I am essentially trying to access the HDF5 store's list of keys, find the one that is no longer desired (in this case it is the min(), if the store keys were dates for example), and then remove that key from the store while preserving the others.
pandas does not seem to have anything for this, and I have looked over the PyTables methods to no avail, having read that PyTables is what drives HDF functionality in Python.
Thanks!
pandas does precisely what you want. The HDFStore.remove function is part of pandas/io/pytables.py (available for v0.19.1 here), and it will remove a node by key, or rows within a node by a condition.
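As a rough sketch of both uses (removing a whole node by key, and removing rows by condition), something like the following should work, assuming PyTables is installed. The file name, the keys (d20170101 and so on) and the "log" table are made up for the demonstration. pd.HDFStore is used instead of pd.get_store because the latter is not available in recent pandas releases.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file name and keys, chosen only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo_store.h5")

with pd.HDFStore(path) as store:
    # Populate the store with three date-like keys.
    for day in ("d20170101", "d20170102", "d20170103"):
        store.put(day, pd.DataFrame({"value": [1, 2, 3]}))

    # Remove a whole node by key: here, the key that sorts lowest.
    oldest = min(store.keys())
    store.remove(oldest)
    remaining = sorted(store.keys())

    # Removing rows by condition needs the queryable "table" format.
    store.put("log", pd.DataFrame({"value": range(5)}), format="table")
    store.remove("log", where="index < 2")
    rows_left = len(store["log"])
```

Note that store.keys() returns the keys with a leading slash, so the minimum is compared on strings such as "/d20170101". That only matches chronological order if the keys are written in a sortable form like YYYYMMDD.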
HDF5 does not shrink the file after a removal (see SO answer), so it is advisable to re-compress/restructure your store every now and then. You can do this from the command line with ptrepack (from SO answer):
    ptrepack --chunkshape=auto --propindexes --complib=blosc test.h5 out.h5