
Deleting a key/table in an HDF Store with Python

Is there a pyTables method similar to the following:

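    import pandas as pd

    # my_store is the path to the HDF5 file (pd.get_store was removed in pandas 1.0)
    with pd.HDFStore(my_store) as store:
        keys = store.keys()
        rem_key = min(keys)  # smallest key, e.g. the oldest date
        store.remove(rem_key)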

I am essentially trying to access the HDF5 store's list of keys, find the one that is no longer desired (in this case it is the min(), if the store keys were dates for example), and then remove that key from the store while preserving the others.

Pandas does not seem to have anything for this, and I have looked over the pyTables methods to no avail, having read that they affect HDF functionality in Python.

Thanks!

asked Nov 02 '15 23:11 by KidMcC


1 Answer

Pandas does precisely what you want. HDFStore.remove is defined in pandas/io/pytables.py (as of v0.19.1). It removes a whole node by key, or only the rows within a node that match a where condition.
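A minimal sketch of the pattern from the question, assuming PyTables is installed and that the keys sort so that the smallest one is the one to drop. The helper names (oldest_key, remove_oldest, remove_rows) and the example key names are made up for illustration and are not part of pandas:

```python
import pandas as pd


def oldest_key(keys):
    """Return the smallest key; for sortable date-style keys this is the oldest."""
    return min(keys)


def remove_oldest(path):
    """Delete the node with the smallest key from the HDF5 store at `path`.

    Returns the removed key, or None if the store is empty.
    """
    # HDFStore is a context manager and closes the file on exit.
    with pd.HDFStore(path, mode="a") as store:
        keys = store.keys()  # e.g. ['/d2015_10_31', '/d2015_11_01']
        if not keys:
            return None
        key = oldest_key(keys)
        store.remove(key)  # drops the whole node; other keys are untouched
        return key


def remove_rows(path, key, where):
    """Delete only the rows of `key` that match `where`.

    Assumption: the node was written with format='table'; row-wise
    removal does not work on fixed-format nodes.
    """
    with pd.HDFStore(path, mode="a") as store:
        store.remove(key, where=where)  # e.g. where="index < 3"
```

Calling remove_oldest("test.h5") would then drop the smallest key and leave the rest of the store intact.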

HDF5 does not shrink the file after a removal, so the freed space is not reclaimed (see SO answer). It is therefore advisable to repack the store every now and then. You can do this from the command line with ptrepack (from SO answer):

    ptrepack --chunkshape=auto --propindexes --complib=blosc test.h5 out.h5
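If the repacking should happen from a script, one option is to call the same command through subprocess and then swap the repacked file into place. This is only a sketch: the helper names and the ".repacked" suffix are assumptions, the store must be closed before repacking, and ptrepack has to be on the PATH (it ships with PyTables).

```python
import os
import subprocess


def ptrepack_command(src, dst, complib="blosc"):
    """Build the ptrepack invocation shown above as an argument list."""
    return [
        "ptrepack",
        "--chunkshape=auto",
        "--propindexes",
        f"--complib={complib}",
        src,
        dst,
    ]


def repack(path):
    """Repack the closed store at `path` and replace it with the compact copy."""
    # Assumption: no file with this temporary name exists yet.
    tmp = path + ".repacked"
    subprocess.run(ptrepack_command(path, tmp), check=True)
    os.replace(tmp, path)  # swap the repacked file in for the original
```

Running repack("test.h5") after a batch of removals would leave a single, compacted test.h5 behind.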
answered Oct 27 '22 22:10 by 0_0