
h5py selective read in

Tags: hdf5, numpy, h5py

I have a problem regarding a selective read-in routine while using h5py.

import h5py
import numpy as np
f = h5py.File('file.hdf5', 'r')
data = f['Data']

The 'Data' dataset contains positive values along with some placeholders set to -9999. How can I get only the positive values for calculations like np.min?

np.ma.masked_array creates a full copy of the array, so all the memory benefits of using h5py are lost. The problem is that I get errors when I try to read datasets that exceed 100 million values each using data = f['Data'][:,0]

If that is not possible, would something like this work?

np.place(data[...], data[...] <= -9999, float('nan'))

Thanks in advance

asked May 21 '26 08:05 by nit


1 Answer

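You could use:

values = f['Data'][...]
data = values[values >= 0]

The comparison cannot be applied to the h5py dataset object itself, so the values have to be read into a NumPy array first. This means the whole dataset is held in memory, and the boolean mask adds roughly one byte per element on top of that.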
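If reading the full dataset at once is too large for memory, one option is to go through it in blocks and keep only a running minimum. The following is a minimal sketch under some assumptions, not a tested recipe for the asker's file: the helper names min_of_valid and min_from_file, the block_rows default, and the 2-D layout with the column of interest at index 0 are all hypothetical choices made for illustration.

```python
import numpy as np


def min_of_valid(dset, column=0, block_rows=1_000_000):
    """Return the minimum of all values >= 0 in one column, or None if none exist.

    `dset` may be an h5py dataset or a NumPy array; only one block of rows
    is held in memory at a time.
    """
    result = None
    for start in range(0, dset.shape[0], block_rows):
        # Slicing an h5py dataset reads just this block from disk.
        block = dset[start:start + block_rows, column]
        valid = block[block >= 0]  # drops the -9999 placeholders
        if valid.size:
            block_min = valid.min()
            result = block_min if result is None else min(result, block_min)
    return result


def min_from_file(path="file.hdf5", name="Data"):
    """Hypothetical wrapper: open the file read-only and scan one dataset."""
    import h5py  # imported here so min_of_valid also works without h5py

    with h5py.File(path, "r") as f:
        return min_of_valid(f[name])
```

Peak memory is then bounded by block_rows rather than by the dataset size. The same loop can accumulate other reductions (maximum, sum, count) without ever building a full mask.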

answered May 24 '26 20:05 by mtzl


