HDF5 how to handle empty rows

Tags: python, hdf5, h5py

I want to write values from a MOCAP sensor to an HDF5 file. To simplify things, let's say I have a table like this:

| time |  x1 |  y1 |  x2 |  y2 |
|    0 | 2.0 | 1.0 | 2.0 | 3.0 |
|    1 | 2.1 | 1.0 | 2.3 | 3.1 |
|    2 | 2.4 | 1.4 |     |     |
|    3 | 2.2 | 1.5 | 2.4 | 3.1 |
|    4 |     |     | 2.3 | 3.2 |

Some cells are empty because my sensor cannot always read the position of a given body at a given time. So my question is: how can I represent this missing information within a single dataset?

In CSV I can simply leave the field empty, writing nothing between two commas. I'm using h5py with Python. Note that my data contains both positive and negative numbers.

More precisely, the question is whether there is a better or more proper way than putting NaN in the field.

silgon asked Oct 31 '25 15:10


1 Answer

I think you are totally correct in using NaN.

I would set a fill value and use that for the missing entries; NaN is the natural choice for floating-point fields.

#!/usr/bin/env python

import numpy as np
import h5py as h5

f = h5.File('test.h5','w')

ctype = np.dtype([('time','i'),
                  ('x1','f8'),('y1','f8'),
                  ('x2','f8'),('y2','f8')])

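# The fill value has to be given when the dataset is created; assigning to
# d.set_fill_value afterwards only sets a Python attribute and does not
# change the HDF5 fill value. For a compound type the fill value is one record.
fill = np.array((0, np.nan, np.nan, np.nan, np.nan), dtype=ctype)
d = f.create_dataset('test', (5,), dtype=ctype, fillvalue=fill)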

data = np.array([(0, 2.0,    1.0,    2.0,    3.0),
                 (1, 2.1,    1.0,    2.3,    3.1),
                 (2, 2.4,    1.4,    np.nan, np.nan),
                 (3, 2.2,    1.5,    2.4,    3.1),
                 (4, np.nan, np.nan, 2.3,    3.2)],
                 dtype = ctype)
d[...] = data
f.close()

Running the script and inspecting the file it produces with h5dump gives:

localhost ~$ ./test.py
localhost ~$ h5dump test.h5
HDF5 "test.h5" {
GROUP "/" {
   DATASET "test" {
      DATATYPE  H5T_COMPOUND {
         H5T_STD_I32LE "time";
         H5T_IEEE_F64LE "x1";
         H5T_IEEE_F64LE "y1";
         H5T_IEEE_F64LE "x2";
         H5T_IEEE_F64LE "y2";
      }
      DATASPACE  SIMPLE { ( 5 ) / ( 5 ) }
      DATA {
      (0): {
            0,
            2,
            1,
            2,
            3
         },
      (1): {
            1,
            2.1,
            1,
            2.3,
            3.1
         },
      (2): {
            2,
            2.4,
            1.4,
            nan,
            nan
         },
      (3): {
            3,
            2.2,
            1.5,
            2.4,
            3.1
         },
      (4): {
            4,
            nan,
            nan,
            2.3,
            3.2
         }
      }
   }
}
}
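Once the missing samples are stored as NaN, they can be separated from the valid ones when the data is read back. The following is a minimal sketch under a few assumptions: it rebuilds the same table in an in-memory file (h5py's `core` driver with `backing_store=False`) so that it runs on its own, and the file name `nan_demo.h5` and the variable names are only illustrative.

```python
import numpy as np
import h5py

# Same compound layout as in the script above ('i4' is a 32-bit integer).
ctype = np.dtype([('time', 'i4'),
                  ('x1', 'f8'), ('y1', 'f8'),
                  ('x2', 'f8'), ('y2', 'f8')])

data = np.array([(0, 2.0,    1.0,    2.0,    3.0),
                 (1, 2.1,    1.0,    2.3,    3.1),
                 (2, 2.4,    1.4,    np.nan, np.nan),
                 (3, 2.2,    1.5,    2.4,    3.1),
                 (4, np.nan, np.nan, 2.3,    3.2)],
                dtype=ctype)

# In-memory file: nothing is written to disk (assumption made for this sketch).
with h5py.File('nan_demo.h5', 'w', driver='core', backing_store=False) as f:
    f.create_dataset('test', data=data)
    table = f['test'][...]   # read everything back as a structured array

# Boolean masks: True where the body was actually tracked.
body1_seen = ~np.isnan(table['x1'])
body2_seen = ~np.isnan(table['x2'])

# Time stamps at which body 2 was not tracked.
times_body2_missing = table['time'][~body2_seen]

# NaN-aware reduction: the mean is taken over valid samples only.
mean_x2 = np.nanmean(table['x2'])
```

Plain reductions such as `np.mean` would propagate the NaN, so the NaN-aware variants (`np.nanmean`, `np.nanmin`, and so on) or an explicit mask are the usual way to work with data stored like this.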

Of course, you don't have to use a compound data type; I used one because it fits the structure of your data.
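As an illustration of the non-compound alternative, here is a rough sketch that stores only the coordinates in a plain 2-D float dataset created with `fillvalue=np.nan`, so that any cell that is never written reads back as NaN. The dataset name `coords`, the column order (x1, y1, x2, y2) and the in-memory file are assumptions made for this example, not part of the original script.

```python
import numpy as np
import h5py

# In-memory file so the sketch leaves nothing on disk (assumption for this example).
with h5py.File('fill_demo.h5', 'w', driver='core', backing_store=False) as f:
    # 5 time steps x 4 coordinates, assumed column order: x1, y1, x2, y2.
    d = f.create_dataset('coords', shape=(5, 4), dtype='f8', fillvalue=np.nan)

    d[0, :] = [2.0, 1.0, 2.0, 3.0]   # both bodies tracked at time 0
    d[2, 0:2] = [2.4, 1.4]           # only body 1 tracked at time 2

    coords = d[...]                  # unwritten cells come back as the fill value
    fill = d.fillvalue               # fill value recorded in the dataset
```

With this layout only the samples that were actually measured need to be written; row 1, which is never touched here, reads back as all NaN.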

Timothy Brown answered Nov 03 '25 08:11


