Write a boost::multi_array to an HDF5 dataset

Are there any libraries or headers available that make it easy to write C++ vectors or boost::multi_arrays to HDF5 datasets?

I have looked at the HDF5 C++ examples. They just use C++ syntax to call C functions, and they only write static C arrays to their datasets (see create.cpp).

Am I missing the point!?

Many thanks in advance, Adam

AdamC asked Feb 12 '12 16:02


2 Answers

Here is how to write N-dimensional multi_arrays in HDF5 format.

Here is a short example:

#include <H5Cpp.h>                  // HDF5 C++ API (H5::H5File etc.)
#include <boost/multi_array.hpp>
using boost::multi_array;
using boost::extents;


// allocate array
int NX = 5,  NY = 6,  NZ = 7;
multi_array<double, 3>  float_data(extents[NX][NY][NZ]);

// initialise the array
for (int ii = 0; ii != NX; ii++)
    for (int jj = 0; jj != NY; jj++)
        for (int kk = 0; kk != NZ; kk++)
            float_data[ii][jj][kk]  = ii + jj + kk;

// 
// write to HDF5 format
// 
H5::H5File file("SDS.h5", H5F_ACC_TRUNC);
write_hdf5(file, "doubleArray", float_data );

Here is code for write_hdf5().

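First, we must map C++ types to HDF5 types (from the H5 C++ API). I have commented out the lines that lead to duplicate definitions, because some of the <stdint.h> types (e.g. uint8_t) are aliases of standard types (e.g. unsigned char). Which lines clash depends on the platform: the selection below suits platforms where int64_t is long (such as 64-bit Linux); where int64_t is long long, the long long specializations clash instead.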

#include <cstdint>
#include <string>
#include <vector>
#include <H5Cpp.h>
#include <boost/multi_array.hpp>

//!_______________________________________________________________________________________
//!     
//!     map types to HDF5 types
//!         
//!     
//!     \author lg (04 March 2013)
//!_______________________________________________________________________________________ 

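// The primary template is declared but not defined, so an unmapped type
// fails to compile with a clear "incomplete type" error. (The specializations
// below expose a data member called type, which write_hdf5() reads, so a
// generic fallback with a static type() function could not be used anyway.)
template<typename T> struct get_hdf5_data_type;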
template<> struct get_hdf5_data_type<char>                  {   H5::IntType type    {   H5::PredType::NATIVE_CHAR       };  };
//template<> struct get_hdf5_data_type<unsigned char>       {   H5::IntType type    {   H5::PredType::NATIVE_UCHAR      };  };
//template<> struct get_hdf5_data_type<short>               {   H5::IntType type    {   H5::PredType::NATIVE_SHORT      };  };
//template<> struct get_hdf5_data_type<unsigned short>      {   H5::IntType type    {   H5::PredType::NATIVE_USHORT     };  };
//template<> struct get_hdf5_data_type<int>                 {   H5::IntType type    {   H5::PredType::NATIVE_INT        };  };
//template<> struct get_hdf5_data_type<unsigned int>        {   H5::IntType type    {   H5::PredType::NATIVE_UINT       };  };
//template<> struct get_hdf5_data_type<long>                {   H5::IntType type    {   H5::PredType::NATIVE_LONG       };  };
//template<> struct get_hdf5_data_type<unsigned long>       {   H5::IntType type    {   H5::PredType::NATIVE_ULONG      };  };
template<> struct get_hdf5_data_type<long long>             {   H5::IntType type    {   H5::PredType::NATIVE_LLONG      };  };
template<> struct get_hdf5_data_type<unsigned long long>    {   H5::IntType type    {   H5::PredType::NATIVE_ULLONG     };  };
template<> struct get_hdf5_data_type<int8_t>                {   H5::IntType type    {   H5::PredType::NATIVE_INT8       };  };
template<> struct get_hdf5_data_type<uint8_t>               {   H5::IntType type    {   H5::PredType::NATIVE_UINT8      };  };
template<> struct get_hdf5_data_type<int16_t>               {   H5::IntType type    {   H5::PredType::NATIVE_INT16      };  };
template<> struct get_hdf5_data_type<uint16_t>              {   H5::IntType type    {   H5::PredType::NATIVE_UINT16     };  };
template<> struct get_hdf5_data_type<int32_t>               {   H5::IntType type    {   H5::PredType::NATIVE_INT32      };  };
template<> struct get_hdf5_data_type<uint32_t>              {   H5::IntType type    {   H5::PredType::NATIVE_UINT32     };  };
template<> struct get_hdf5_data_type<int64_t>               {   H5::IntType type    {   H5::PredType::NATIVE_INT64      };  };
template<> struct get_hdf5_data_type<uint64_t>              {   H5::IntType type    {   H5::PredType::NATIVE_UINT64     };  };
template<> struct get_hdf5_data_type<float>                 {   H5::FloatType type  {   H5::PredType::NATIVE_FLOAT      };  };
template<> struct get_hdf5_data_type<double>                {   H5::FloatType type  {   H5::PredType::NATIVE_DOUBLE     };  };
template<> struct get_hdf5_data_type<long double>           {   H5::FloatType type  {   H5::PredType::NATIVE_LDOUBLE    };  };
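
A possible way to avoid commenting lines in and out per platform, sketched here under stated assumptions: specialize only on the fundamental types, which are always distinct from one another, and let the fixed-width aliases resolve to whichever fundamental type they name. The names hdf5_type_name, hdf5_name_of and check_fixed_width_aliases are hypothetical and not part of HDF5 or of the answer's code. To keep the sketch compilable without HDF5 installed, each specialization returns the name of the H5::PredType constant as a string; in real code it would hold or return the H5 type itself.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical stand-in for get_hdf5_data_type: maps a C++ type to the *name*
// of the H5::PredType constant it would select. Declared only, so an
// unmapped type is a compile-time error.
template<typename T> struct hdf5_type_name;

// One specialization per fundamental type. These types are all distinct,
// so no two specializations can collide on any platform.
template<> struct hdf5_type_name<char>               { static const char* get() { return "NATIVE_CHAR";    } };
template<> struct hdf5_type_name<signed char>        { static const char* get() { return "NATIVE_SCHAR";   } };
template<> struct hdf5_type_name<unsigned char>      { static const char* get() { return "NATIVE_UCHAR";   } };
template<> struct hdf5_type_name<short>              { static const char* get() { return "NATIVE_SHORT";   } };
template<> struct hdf5_type_name<unsigned short>     { static const char* get() { return "NATIVE_USHORT";  } };
template<> struct hdf5_type_name<int>                { static const char* get() { return "NATIVE_INT";     } };
template<> struct hdf5_type_name<unsigned int>       { static const char* get() { return "NATIVE_UINT";    } };
template<> struct hdf5_type_name<long>               { static const char* get() { return "NATIVE_LONG";    } };
template<> struct hdf5_type_name<unsigned long>      { static const char* get() { return "NATIVE_ULONG";   } };
template<> struct hdf5_type_name<long long>          { static const char* get() { return "NATIVE_LLONG";   } };
template<> struct hdf5_type_name<unsigned long long> { static const char* get() { return "NATIVE_ULLONG";  } };
template<> struct hdf5_type_name<float>              { static const char* get() { return "NATIVE_FLOAT";   } };
template<> struct hdf5_type_name<double>             { static const char* get() { return "NATIVE_DOUBLE";  } };
template<> struct hdf5_type_name<long double>        { static const char* get() { return "NATIVE_LDOUBLE"; } };

// Convenience wrapper: strips const/volatile before the lookup.
template<typename T>
std::string hdf5_name_of()
{
    return hdf5_type_name<typename std::remove_cv<T>::type>::get();
}

// Fixed-width aliases need no specializations of their own: each one is
// just another name for one of the fundamental types mapped above.
inline void check_fixed_width_aliases()
{
    assert(hdf5_name_of<const float>() == "NATIVE_FLOAT");

    // int64_t is long on some platforms and long long on others.
    const std::string name = hdf5_name_of<std::int64_t>();
    assert(name == "NATIVE_LONG" || name == "NATIVE_LLONG");
}
```

With this layout, a call such as hdf5_name_of<std::uint16_t>() compiles without a dedicated uint16_t line, because the alias already names a mapped fundamental type.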

Then we can use a bit of template forwarding magic to make a function of the right type to output our data. Since this is template code, it needs to live in a header file if you are going to output HDF5 arrays from multiple source files in your programme:

//!_______________________________________________________________________________________
//!     
//!     write_hdf5 multi_array
//!         
//!     \author leo Goodstadt (04 March 2013)
//!     
//!_______________________________________________________________________________________
template<typename T, std::size_t DIMENSIONS, typename hdf5_data_type>
void do_write_hdf5(H5::H5File file, const std::string& data_set_name, const boost::multi_array<T, DIMENSIONS>& data, hdf5_data_type& datatype)
{
    // Store the values little-endian (the native byte order on x86)
    datatype.setOrder(H5T_ORDER_LE);

    // shape() points at DIMENSIONS extents; copy them as hsize_t for HDF5
    std::vector<hsize_t> dimensions(data.shape(), data.shape() + DIMENSIONS);
    H5::DataSpace dataspace(DIMENSIONS, dimensions.data());

    H5::DataSet dataset = file.createDataSet(data_set_name, datatype, dataspace);

    dataset.write(data.data(), datatype);
}

template<typename T, std::size_t DIMENSIONS>
void write_hdf5(H5::H5File file, const std::string& data_set_name, const boost::multi_array<T, DIMENSIONS>& data )
{

    get_hdf5_data_type<T> hdf_data_type;
    do_write_hdf5(file, data_set_name, data, hdf_data_type.type);
}

Leo Goodstadt answered Nov 12 '22 04:11


I am unaware of any. The HDF5 C++ wrappers are not that great, particularly because they cannot be combined with parallel HDF5. So I wrote my own wrappers in about 2 hours, and they work just fine. Ultimately, you will have to call the C API, either directly or indirectly through C++ bindings of your own.

Fortunately, both vectors and multi_arrays store their elements contiguously, so you can pass their data pointers directly into HDF5 function calls.
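
To illustrate what contiguous storage buys you, here is a minimal sketch that uses only the standard library. It lays a 3-D array out in a flat std::vector in row-major (C) order, which is the layout HDF5 assumes for a dataset and also the default layout of boost::multi_array. The function names row_major_offset and make_flat_3d are hypothetical. No HDF5 calls appear, so the sketch compiles on its own; in real code flat.data() and the extents {nx, ny, nz} are what you would hand to the HDF5 write call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major (C order) offset of an N-dimensional index into a flat buffer:
// the last index varies fastest. shape and index must have the same length.
inline std::size_t row_major_offset(const std::vector<std::size_t>& shape,
                                    const std::vector<std::size_t>& index)
{
    assert(shape.size() == index.size());
    std::size_t offset = 0;
    for (std::size_t d = 0; d < shape.size(); ++d)
        offset = offset * shape[d] + index[d];
    return offset;
}

// Build a flat buffer in which element (i, j, k) holds i + j + k, the same
// values as the multi_array example in the first answer.
inline std::vector<double> make_flat_3d(std::size_t nx, std::size_t ny, std::size_t nz)
{
    const std::vector<std::size_t> shape = {nx, ny, nz};
    std::vector<double> flat(nx * ny * nz);
    for (std::size_t i = 0; i < nx; ++i)
        for (std::size_t j = 0; j < ny; ++j)
            for (std::size_t k = 0; k < nz; ++k)
                flat[row_major_offset(shape, {i, j, k})] = static_cast<double>(i + j + k);
    return flat;
}
```

Note the assumption about layout: a multi_array constructed with fortran_storage_order, or a view into a larger array, does not match this row-major picture, so its data would need to be copied into a plain row-major buffer first.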

tpg2114 answered Nov 12 '22 02:11