
Adding structure to HDF5 files - Equivalent of NetCDF "Conventions" for HDF5

Tags: hdf5, netcdf

NetCDF4 has the "Conventions" attribute convention for adding structure to netCDF files. I'm looking for the analogous thing for HDF5 specifically.

My general aim is to add structure to my HDF5 files in a standard way. I want to do something like what HDF5 does with images to define a type, using attributes on groups and datasets, roughly like:

CLASS: IMAGE
IMAGE_VERSION: 1.2
IMAGE_SUBCLASS: IMAGE_TRUECOLOR
...

But as far as I can tell, that image specification is standalone. Maybe I should just reuse the NetCDF "conventions"?
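To make the idea concrete, here is a minimal sketch of what an in-house type tag modelled on the image attributes might look like, together with a check that a set of attributes conforms to it. The convention itself (`CLASS: FIELD`, `FIELD_VERSION`, `FIELD_SUBCLASS`, `units`) is hypothetical and invented for illustration; it is not part of any HDF5 specification. The check only needs a mapping, so it is shown here with plain dicts.

```python
from typing import List, Mapping

# Hypothetical in-house convention, modelled on the HDF5 image attributes.
# None of these names come from an HDF5 specification.
REQUIRED_ATTRS = {
    "CLASS": "FIELD",        # type tag, analogous to CLASS: IMAGE
    "FIELD_VERSION": "1.0",  # version of this made-up convention
}
# Attributes that must exist but may hold any value.
REQUIRED_KEYS = ("FIELD_SUBCLASS", "units")


def _as_text(value) -> str:
    """Decode bytes to str so values compare equal however they were stored."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def check_attrs(attrs: Mapping) -> List[str]:
    """Return a list of problems; an empty list means the attributes conform."""
    problems = []
    for key, expected in REQUIRED_ATTRS.items():
        if key not in attrs:
            problems.append(f"missing attribute {key}")
        elif _as_text(attrs[key]) != expected:
            problems.append(
                f"{key} is {_as_text(attrs[key])!r}, expected {expected!r}"
            )
    for key in REQUIRED_KEYS:
        if key not in attrs:
            problems.append(f"missing attribute {key}")
    return problems


good = {"CLASS": b"FIELD", "FIELD_VERSION": "1.0",
        "FIELD_SUBCLASS": "FIELD_VELOCITY", "units": "m/s"}
bad = {"CLASS": "IMAGE", "units": "m/s"}
print(check_attrs(good))
print(check_attrs(bad))
```

h5py exposes attributes through a dict-like `.attrs` object on groups and datasets, so the same function could presumably be called as `check_attrs(f["some_dataset"].attrs)`, where `some_dataset` is a placeholder name.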

Update:

I'm aware NetCDF4 is implemented on top of HDF5. In this case, we have data from turbulence simulations and experiments, not geo data. This data is usually limited to <= 4D. We already use HDF5 for storing this data, but we have no formal standards. Pseudo-standard formats have just sort of developed organically within the organization.

spinkus asked Apr 21 '16 11:04


1 Answer

NetCDF4 files are actually stored in the HDF5 format (http://www.unidata.ucar.edu/publications/factsheets/current/factsheet_netcdf.pdf); however, they use netCDF4 conventions for attributes, dimensions, etc. The files are self-describing, which is a big plus. HDF5 without netCDF4 allows much more liberty in defining your data. Is there a specific reason that you would like to use HDF5 instead of netCDF4?

I would say that if you don't have any specific constraints (like a model or visualisation software that chokes on netCDF4 files), you'd be better off using netCDF. netCDF4 can be used with the NCO/CDO operators, NCL (which also accepts HDF5), IDL, the netCDF4 Python module, Ferret, etc. Personally, I find netCDF4 very convenient for storing climate or meteorological data. There are a lot of operators already written for it, and you don't have to go through the trouble of developing a standard for your own data: it's already done for you. CMOR (http://cmip-pcmdi.llnl.gov/cmip5/output_req.html) can be used to write CF-compliant climate data. It was used for the most recent climate model comparison project.

On the other hand, HDF5 might be worth it if you have another type of data and you are looking for some very specific features for which you need a more customised file format. Would you mind specifying your needs a little better in the comments?

Update:

Unfortunately, the standards for variable and field names are a little less clear and well-organised for HDF5 files than for netCDF, since netCDF was the format of choice for big climate modelling projects like CMIP or CORDEX. The problem essentially boils down to using EOSDIS or CF conventions, but finding currently maintained libraries that implement these standards for HDF5 files and have clear documentation isn't exactly easy (if it were, you probably wouldn't have posed the question).
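One lightweight piece of the netCDF approach that can be carried over to plain HDF5 is the `Conventions` attribute itself: a single string on the root group naming the convention(s) a file follows. As I understand the netCDF recommendation, multiple names are separated by spaces, or by commas if any name contains a space. The sketch below parses such a string; the `MyLab Turbulence-0.1` name is a made-up placeholder for an in-house convention.

```python
from typing import List


def parse_conventions(value: str) -> List[str]:
    """Split a Conventions attribute into individual convention names.

    Names are assumed to be separated by spaces, or by commas when
    any name itself contains a space.
    """
    if "," in value:
        parts = value.split(",")
    else:
        parts = value.split()
    return [p.strip() for p in parts if p.strip()]


def follows(value: str, name: str) -> bool:
    """True if a declared convention name starts with `name`."""
    return any(c.startswith(name) for c in parse_conventions(value))


print(parse_conventions("CF-1.6 ACDD-1.3"))
# "MyLab Turbulence-0.1" is a hypothetical in-house convention name.
print(follows("CF-1.6, MyLab Turbulence-0.1", "MyLab Turbulence"))
```

A reader tool could then branch on `follows(...)` to decide which attribute checks to apply to the rest of the file.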

If you really just want a standard, NASA explains all the different possible metadata standards in painful detail here: http://gcmd.nasa.gov/add/standards/index.html.

Note that HDF-EOS and HDF5 aren't exactly the same format (HDF-EOS already contains cartography data and is standardised for earth science data), so I don't know if this format would be too restrictive for you. The tools for working with this format are described here: http://hdfeos.net/software/tool.php and summarized here: http://hdfeos.org/help/reference/HTIC_Brochure_Examples.pdf.

If you still prefer to use HDF5, your best bet would probably be to download an HDF5-formatted file from NASA for similar data and use it as a basis to create your own tools in the language of your choice. Here's a list of comprehensive examples using HDF5, HDF4 and HDF-EOS formats with scripts for data processing and visualisation in Python, MATLAB, IDL and NCL: http://hdfeos.net/zoo/index_openLAADS_Examples.php#MODIS

Essentially, the problem is that NASA makes tools available so that you can work with their data, but not necessarily so that you can re-create similarly structured data in your own lab setting.

Here are some more specs and information about HDF5 for earth science data from NASA:

MERRA product: https://gmao.gsfc.nasa.gov/products/documents/MERRA_File_Specification.pdf

GrADS-compatible HDF5 information: http://disc.sci.gsfc.nasa.gov/recipes/?q=recipes/How-to-Read-Data-in-HDF-5-Format-with-GrADS

HDF data manipulation tools on NASA's Atmospheric Science Data Center: https://eosweb.larc.nasa.gov/HBDOCS/hdf_data_manipulation.html

Hope this helps a little.

SpicyBaguette answered Oct 10 '22 05:10