I am just picking up HDF5 and I am a bit confused about the difference between creating a datatype for memory and creating a datatype for the file. What's the difference?
In this example, a single compound datatype is created and then used both to create the dataset in the file and to write the data from memory:
/*
 * Create the memory data type.
 */
s1_tid = H5Tcreate (H5T_COMPOUND, sizeof(s1_t));
H5Tinsert(s1_tid, "a_name", HOFFSET(s1_t, a), H5T_NATIVE_INT);
H5Tinsert(s1_tid, "c_name", HOFFSET(s1_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s1_tid, "b_name", HOFFSET(s1_t, b), H5T_NATIVE_FLOAT);
/*
 * Create the dataset.
 */
dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT);
/*
 * Write data to the dataset.
 */
status = H5Dwrite(dataset, s1_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);
However, in another example here, the author also creates a separate compound datatype for the file, with different member types. For example, in the datatype for memory, serial_no uses H5T_NATIVE_INT, but in the datatype for the file, serial_no uses H5T_STD_I64BE. Why does he do this?
/*
 * Create the compound datatype for memory.
 */
memtype = H5Tcreate (H5T_COMPOUND, sizeof (sensor_t));
status = H5Tinsert (memtype, "Serial number",
            HOFFSET (sensor_t, serial_no), H5T_NATIVE_INT);
status = H5Tinsert (memtype, "Location", HOFFSET (sensor_t, location),
            strtype);
status = H5Tinsert (memtype, "Temperature (F)",
            HOFFSET (sensor_t, temperature), H5T_NATIVE_DOUBLE);
status = H5Tinsert (memtype, "Pressure (inHg)",
            HOFFSET (sensor_t, pressure), H5T_NATIVE_DOUBLE);
/*
 * Create the compound datatype for the file. Because the standard
 * types we are using for the file may have different sizes than
 * the corresponding native types, we must manually calculate the
 * offset of each member.
 */
filetype = H5Tcreate (H5T_COMPOUND, 8 + sizeof (hvl_t) + 8 + 8);
status = H5Tinsert (filetype, "Serial number", 0, H5T_STD_I64BE);
status = H5Tinsert (filetype, "Location", 8, strtype);
status = H5Tinsert (filetype, "Temperature (F)", 8 + sizeof (hvl_t),
            H5T_IEEE_F64BE);
status = H5Tinsert (filetype, "Pressure (inHg)", 8 + sizeof (hvl_t) + 8,
            H5T_IEEE_F64BE);
/*
 * Create dataspace. Setting maximum size to NULL sets the maximum
 * size to be the current size.
 */
space = H5Screate_simple (1, dims, NULL);
/*
 * Create the dataset and write the compound data to it.
 */
dset = H5Dcreate (file, DATASET, filetype, space, H5P_DEFAULT, H5P_DEFAULT,
            H5P_DEFAULT);
status = H5Dwrite (dset, memtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, wdata);
What is the difference between these two methods?
From http://www.hdfgroup.org/HDF5/doc/UG/UG_frame11Datatypes.html:
H5T_NATIVE_INT corresponds to a C int type. On an Intel based PC, this type is the same as H5T_STD_I32LE, while on a MIPS system this would be equivalent to H5T_STD_I32BE.
That is to say, H5T_NATIVE_INT has a different memory layout on different types of processors. If your data is only used in memory, meaning it never leaves this machine, you may prefer H5T_NATIVE_INT for better performance, since it matches the C int in your program and needs no conversion.
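To see what a machine-dependent memory layout means in practice, here is a small sketch in plain C. It does not call HDF5 at all, and the function names (host_is_little_endian, native_int_bytes) are made up for this illustration. It only exposes the raw bytes of a C int, which is the layout H5T_NATIVE_INT describes on whatever machine the code runs on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this machine stores the least
 * significant byte first (little-endian, e.g. Intel), 0 otherwise. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;

    /* Look at the byte stored at the lowest address of probe. */
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Hypothetical helper: copies the in-memory bytes of a native int into
 * out, which must hold at least sizeof(int) bytes. Returns the number
 * of bytes copied. Both the count and the byte order depend on the
 * machine and compiler. */
size_t native_int_bytes(int value, unsigned char *out)
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
    return sizeof value;
}
```

Calling native_int_bytes(1, buf) on a little-endian machine puts the 1 in buf[0]; on a big-endian machine it ends up in the last byte instead.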
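But if your data will be saved to a file and used by different systems, you should specify a fixed integer type such as H5T_STD_I64BE or H5T_STD_I32LE, so that the layout in the file does not depend on the machine that wrote it. If you use H5T_NATIVE_INT and create the data file on an Intel-based PC, the number will be saved as H5T_STD_I32LE. HDF5 records that type in the file and converts on read, so a MIPS system (where H5T_NATIVE_INT is H5T_STD_I32BE) still gets the correct value, but the on-disk format then varies with the platform that wrote the file. That is why the second example passes filetype to H5Dcreate and memtype to H5Dwrite: the library converts between the two during the write.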
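To make the idea of a fixed file layout concrete, here is a rough sketch in plain C of the kind of conversion involved when a native int is stored as a 64-bit big-endian integer (the layout H5T_STD_I64BE names). This is not HDF5 code and not how the library is implemented; encode_i64be and decode_i64be are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes value as 8 bytes, most significant byte
 * first. The result is the same on every machine, whatever the size
 * or byte order of a native int. */
void encode_i64be(int value, unsigned char out[8])
{
    assert(out != NULL);

    /* Widen to 64 bits (sign-extending), then treat as unsigned so the
     * shifts below are well defined. */
    uint64_t bits = (uint64_t)(int64_t)value;

    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(bits >> (56 - 8 * i));
}

/* Hypothetical helper: reads 8 big-endian bytes back into a signed
 * 64-bit value in this machine's native representation. */
int64_t decode_i64be(const unsigned char in[8])
{
    uint64_t bits = 0;
    int64_t value;

    assert(in != NULL);
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];

    /* Reinterpret the bit pattern as signed without relying on
     * implementation-defined integer conversion. */
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

With HDF5 you do not write this yourself: you describe the memory layout and the file layout as two datatypes and let H5Dwrite and H5Dread do the conversion.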