
Python: h5py gives OSError: Can't read data (inflate() failed) even though it has read from the file before

Python 3.5. I have a few hundred .mat files (version 7.3) in a directory. I loop through all of them to extract two different parts of the data. The first pass, which extracts the first part, runs with no problems at all. But when I do exactly the same thing again to extract a different part of the data, I get the following error:

Traceback (most recent call last):
  File "v73_test.py", line 43, in <module>
    mrfs_data = extract.convert1simProteinComCountsIntoDataFrame(path2mats)
  File "/home/oli/Downloads/PhD/wc/mg/version_73_stuff/functions_for_joshuas_matFiles/extract_matFile_data_v73.py", line 586, in convert1simProteinComCountsIntoDataFrame
    raw_data = getMatureProteinComplexs(path2mats, state_no)
  File "/home/oli/Downloads/PhD/wc/mg/version_73_stuff/functions_for_joshuas_matFiles/extract_matFile_data_v73.py", line 53, in getMatureProteinComplexs
    if len(np.array(state_file['ProteinComplex']['counts']).shape) == 3:
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/oli/virtualenvs/standard_python3.5/lib/python3.5/site-packages/h5py/_hl/dataset.py", line 696, in __array__
    self.read_direct(arr)
  File "/home/oli/virtualenvs/standard_python3.5/lib/python3.5/site-packages/h5py/_hl/dataset.py", line 657, in read_direct
    self.id.read(mspace, fspace, dest, dxpl=self._dxpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 181, in h5py.h5d.DatasetID.read
  File "h5py/_proxy.pyx", line 130, in h5py._proxy.dset_rw
  File "h5py/_proxy.pyx", line 84, in h5py._proxy.H5PY_H5Dread
OSError: Can't read data (inflate() failed)

The file is definitely there and accessible, so the only thing I can think of is that the data is corrupted. But if that were the case, surely I wouldn't have been able to extract any data from it at all?

ojunk asked Jan 04 '23 06:01



1 Answer

I am answering my own question because there isn't much on the net about this error and I learnt something, so maybe it will help someone else.

I have since realised that the data is in fact corrupted. I had assumed that nothing could be extracted from a corrupted file, but that is not true here: the only part you can't read is the specific part that is corrupted. This was not what I expected based on past experience with other versions of .mat files, but it makes sense once you consider what version 7.3 actually is: an HDF5-based format in which each variable is stored as its own dataset. The inflate() in the message is the zlib decompression step, so the error points to damaged compressed bytes in that one dataset, while the other datasets in the same file remain readable.
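To find out which parts of a file are affected, one option is to try reading every dataset and record the ones that fail. The following is only a minimal sketch, not the code I originally used: the function name `find_unreadable_datasets` is made up for illustration, and it assumes each dataset fits in memory, since it reads every one in full.

```python
import h5py


def find_unreadable_datasets(path):
    """Return (dataset_name, error_message) pairs for datasets that fail to read.

    Hypothetical helper for illustration; it reads each dataset completely,
    so it assumes the individual datasets fit in memory.
    """
    failures = []

    def try_read(name, obj):
        # visititems calls this for every group and dataset in the file.
        if isinstance(obj, h5py.Dataset):
            try:
                obj[()]  # force a full read, which decompresses the stored data
            except OSError as err:
                failures.append((name, str(err)))
        return None  # returning None lets the traversal continue

    with h5py.File(path, "r") as f:
        f.visititems(try_read)
    return failures
```

Calling this on each .mat file in the directory and printing the returned pairs should show which datasets raise the inflate() error, so the loop can skip just those instead of treating the whole file as unusable.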

ojunk answered Jan 31 '23 19:01
