
How to write 2D std vector of floats to HDF5 file and then read it in python

I want to write a 2D vector of floats to an HDF5 file. I used the following code (writeh5.cpp):

#include <cstdlib> 
#include <ctime> 
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <H5Cpp.h>

using namespace H5;
using namespace std;

int main(void) {
  int nrow = 5;
  int ncol = 4;

  vector<vector< double > > vec2d;
  vec2d.resize(nrow, vector<double>(ncol, 0.0));

  srand((unsigned)time(0));

  typename vector< vector< double > >::iterator row;
  typename vector< double >::iterator col;
  for (row = vec2d.begin(); row != vec2d.end(); row++) {
    cout << endl;
    for (col = row->begin(); col != row->end(); col++) {

      *col = (rand()/(RAND_MAX+1.0));
      cout << *col << '\t';
    }
  }
  cout << endl;

  H5File file("test.h5", H5F_ACC_TRUNC);

  // dataset dimensions
  hsize_t dimsf[2];
  dimsf[0] = nrow;
  dimsf[1] = ncol;
  DataSpace dataspace(2, dimsf);

  DataType datatype(H5::PredType::NATIVE_DOUBLE);
  DataSet dataset = file.createDataSet("data", datatype, dataspace);

  // dataset.write(vec2d.data(), H5::PredType::NATIVE_DOUBLE);
  dataset.write(&vec2d[0][0], H5::PredType::NATIVE_DOUBLE);

  cout << endl << " vec2d has " << endl;
  for (row = vec2d.begin(); row != vec2d.end(); row++) {
      cout << endl;
      for (col = row->begin(); col != row->end(); col++) {            

        cout << *col << '\t';
      }
  }
  cout << endl;

  dataset.close();
  dataspace.close();
  file.close();

  return 0;
}

I compiled it using g++ writeh5.cpp -I/usr/include/hdf5/ -lhdf5_cpp -lhdf5 -Wall

A run of the code produced the following output:

0.325553        0.598941        0.364489        0.0125061
0.374205        0.0319419       0.380329        0.815621
0.863754        0.386279        0.0173515       0.15448
0.703936        0.372486        0.728436        0.991631
0.666207        0.568983        0.807475        0.964276

It also created the file test.h5.

When I read this file from Python using the following script:

import h5py
import numpy as np

file = h5py.File("test.h5", 'r')
dataset = np.array(file["data"])

print(dataset)

file.close()

I got

 [[  3.25553381e-001   5.98941262e-001   3.64488814e-001   1.25061036e-002]
 [  0.00000000e+000   2.42092166e-322   3.74204732e-001   3.19418786e-002]
 [  3.80329057e-001   8.15620518e-001   0.00000000e+000   2.42092166e-322]
 [  8.63753530e-001   3.86278684e-001   1.73514970e-002   1.54479635e-001]
 [  0.00000000e+000   2.42092166e-322   7.03935940e-001   3.72486182e-001]]

The first row is good; the other rows are garbage.

I also tried dataset.write(&vec2d[0]... and dataset.write(vec2d[0].data()..., and got similar problems.

I want to:

  1. Write an HDF5 file with the contents of a 2D std::vector of doubles,
  2. Read the file in Python and store the contents in a NumPy array.

What am I doing wrong?

Caos21 asked Sep 08 '15 00:09



2 Answers

Apparently, I am not allowed to pass a std::vector of vectors to the write function. Copying the elements of the vector into a fixed-size 2D array solves the problem, because the write function happily accepts that array.

However, I am not happy with this solution; I expected to be able to pass the vectors directly to the write function.

Here is the code:

#include <cstdlib> 
#include <ctime> 
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <H5Cpp.h>

using namespace H5;
using namespace std;

int main(void) {
  // compile-time constants, so that varray below is a standard fixed-size array
  const int nrow = 5;
  const int ncol = 4;

  vector<vector< double > > vec2d;
  vec2d.resize(nrow, vector<double>(ncol, 0.0));

  srand((unsigned)time(0));

  // generate some data
  typename vector< vector< double > >::iterator row;
  typename vector< double >::iterator col;
  for (row = vec2d.begin(); row != vec2d.end(); row++) {
    cout << endl;
    for (col = row->begin(); col != row->end(); col++) {            
        *col = (rand()/(RAND_MAX+1.0));
        cout << *col << '\t';
    }
  }
  cout << endl;

  // copy into a plain 2D array, which occupies one contiguous block of memory
  double varray[nrow][ncol];
  for (int i = 0; i < nrow; ++i) {
    for (int j = 0; j < ncol; ++j) {
      varray[i][j] = vec2d[i][j];
    }
  }

  H5File file("test.h5", H5F_ACC_TRUNC);

  // dataset dimensions
  hsize_t dimsf[2];
  dimsf[0] = nrow;
  dimsf[1] = ncol;
  DataSpace dataspace(2, dimsf);

  DataType datatype(H5::PredType::NATIVE_DOUBLE);
  DataSet dataset = file.createDataSet("data", datatype, dataspace);

  dataset.write(varray, H5::PredType::NATIVE_DOUBLE);


  cout << endl;

  dataset.close();
  dataspace.close();
  file.close();

  return 0;
}
Caos21 answered Oct 04 '22 03:10


I ran into the same problem when I converted my data from a vector to a dynamic 2D array. The problem with the HDF5 write call is not that it will not accept a vector; it is that it does not understand the concept of a pointer array. It only writes out contiguous memory. A vector of vectors is not contiguous in memory: it is effectively an array of pointers to separately allocated rows. That is why, when you passed the address of the first element, the first row was correct. The rest of the table is just whatever happens to sit in memory after the first row's buffer.

My solution was to create one large 1D vector and do my own indexing to convert back and forth. This is similar to the approach in h5_writedyn.c: https://www.hdfgroup.org/ftp/HDF5/examples/misc-examples/h5_writedyn.c

crazywill32 answered Oct 04 '22 02:10