 

Error opening file in H5PY (File signature not found)

I've been using the following bit of code to open some HDF5 files, produced in MATLAB, in Python using h5py:

import h5py as h5
data='dataset.mat'
f=h5.File(data, 'r')

However I'm getting the following error:

OSError: Unable to open file (File signature not found)

I've checked that the files that I'm trying to open are version 7.3 MAT-files and are HDF5 format. In fact I've used H5PY to open the same files successfully before. I've confirmed that the files exist and are accessible so I'm not really sure where the error is coming from. Any advice would be greatly appreciated, thanks in advance : )

Anisha Singh asked Jun 29 '16 03:06




3 Answers

Usually the message File signature not found indicates either:

1. Your file is corrupted.

... is what I think is most likely. You said you've opened the files before. Maybe you forgot to close your file handle, which can corrupt the file (particularly if it was opened for writing). Try checking the file with the HDF5 utility h5debug (available on the command line if you've installed the HDF5 library on your OS; check with dpkg -s libhdf5-dev on Debian/Ubuntu).
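If you'd rather check from Python first, here is a minimal sketch that looks for the HDF5 signature bytes directly. It relies on the HDF5 format placing its 8-byte signature at offset 0, 512, 1024, 2048 and so on (v7.3 MAT-files typically carry a 512-byte MATLAB header in front of it). The function name find_hdf5_signature and the max_offset cap are my own, not part of h5py:

```python
from pathlib import Path

# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_hdf5_signature(path, max_offset=1 << 20):
    """Return the byte offset of the HDF5 signature, or None if it is absent.

    Hypothetical helper: probes offsets 0, 512, 1024, 2048, ... up to
    max_offset, which is where the HDF5 format allows the superblock to start.
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as fh:
        offset = 0
        while offset < size and offset <= max_offset:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            # Next candidate offset: 512 first, then keep doubling.
            offset = 512 if offset == 0 else offset * 2
    return None
```

If find_hdf5_signature('dataset.mat') returns None, the file on disk is not (or no longer) an HDF5 file, which would match the "File signature not found" message. A non-None result only says the signature is there; the rest of the file could still be damaged.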

2. The file is not in HDF5 format.

This is a known cause of your error message. But since you said you made sure the files are HDF5 and you've opened them before, I'm giving this just as a reference for others who may stumble here:

MAT-files saved as version 7.3 (available since MATLAB R2006b, and not the default save format) use an HDF5-based format rather than the older Level 5 binary format (more doc). Earlier MAT-file versions (v4 (Level 1.0), v6 and v7 to 7.2) are not HDF5 and can be read with the scipy library:

import scipy.io
f = scipy.io.loadmat('dataset.mat')
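To find out which of the two cases applies before picking a loader, you can peek at the text at the start of the file. This is only a sketch: it assumes the header strings MATLAB normally writes ("MATLAB 7.3 MAT-file" for v7.3, "MATLAB 5.0 MAT-file" for v6/v7), and the function name and return labels are made up for illustration. v4 files have no text header, so they fall into "unknown":

```python
def mat_file_flavor(path):
    """Classify a .mat file by the descriptive text that starts its header.

    Hypothetical helper; returns one of "hdf5", "level5" or "unknown".
    """
    with open(path, "rb") as fh:
        header = fh.read(128)  # the text header sits in the first 128 bytes
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "hdf5"      # v7.3: try h5py
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "level5"    # v6 / v7: try scipy.io.loadmat
    return "unknown"       # v4, truncated, or not a MAT-file at all
```

With that, "hdf5" means h5py.File should work, "level5" means scipy.io.loadmat is the right tool, and "unknown" is a hint that the file may be damaged or something else entirely.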

Otherwise you may try other methods and see whether the error persists:

PyTables is an alternative to h5py and can be found here.

import tables
# Open read-only; the with-block closes the handle again.
with tables.open_file('dataset.mat', mode='r') as f:
    print(f)

Install using

pip install tables

The MATLAB Engine for Python is an alternative way to read MAT-files, if you have MATLAB installed. Documentation is found here: MATLAB Engine API for Python.

import matlab.engine
mat = matlab.engine.start_matlab()
f = mat.load("dataset.mat", nargout=1)

Honeybear answered Oct 16 '22 19:10


I was facing the same issue with my .h5 file. The problem was that I was not downloading the .h5 file correctly.

I was doing filename.h5 -> right click -> "Save link as", which did not download the file correctly (or maybe the file was getting corrupted). Instead, I selected the checkbox next to filename.h5 and clicked Download, and after that my code worked.

Maybe this helps others who are making the same mistake.
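One quick way to spot this kind of bad download is to check whether the saved "file" is really a web page. A rough sketch, assuming "Save link as" stored the HTML of the download page rather than the data (that is a guess about what went wrong, and looks_like_html is just a name I picked):

```python
def looks_like_html(path, probe=512):
    """Return True if the file starts like an HTML document.

    Hypothetical helper: only inspects the first `probe` bytes.
    """
    with open(path, "rb") as fh:
        head = fh.read(probe).lstrip().lower()
    return head.startswith((b"<!doctype", b"<html"))
```

If looks_like_html('filename.h5') is True, download the file again using the site's own download button.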

Rajat answered Oct 16 '22 18:10


Usually this happens when the file is corrupted. I faced this problem, downloaded the file again, and that resolved the issue.

Dharmendra Singh answered Oct 16 '22 19:10