Problem
I'm trying to load a file with PyTorch, but it fails with an error saying that
archive/data.pkl was not found.
Code
import torch
cachefile = 'cacheddata.pth'
torch.load(cachefile)
Output
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-4-8edf1f27a4bd> in <module>
1 import torch
2 cachefile = 'cacheddata.pth'
----> 3 torch.load(cachefile)
~/opt/anaconda3/envs/matching/lib/python3.8/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
582 opened_file.seek(orig_position)
583 return torch.jit.load(opened_file)
--> 584 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
585 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
586
~/opt/anaconda3/envs/matching/lib/python3.8/site-packages/torch/serialization.py in _load(zip_file, map_location, pickle_module, **pickle_load_args)
837
838 # Load the data (which may in turn use `persistent_load` to load tensors)
--> 839 data_file = io.BytesIO(zip_file.get_record('data.pkl'))
840 unpickler = pickle_module.Unpickler(data_file, **pickle_load_args)
841 unpickler.persistent_load = persistent_load
RuntimeError: [enforce fail at inline_container.cc:209] . file not found: archive/data.pkl
Hypothesis
I'm guessing this has something to do with pickle. From the docs:
This save/load process uses the most intuitive syntax and involves the least amount of code. Saving a model in this way will save the entire module using Python’s pickle module. The disadvantage of this approach is that the serialized data is bound to the specific classes and the exact directory structure used when the model is saved. The reason for this is because pickle does not save the model class itself. Rather, it saves a path to the file containing the class, which is used during load time. Because of this, your code can break in various ways when used in other projects or after refactors.
Versions
Turned out the file was somehow corrupted. After generating it again it loaded without issue.
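For anyone who wants to check for this kind of corruption before regenerating the file, here is a minimal sketch using only the standard library. It assumes the file was written by torch.save in the zip-based format, which is what the traceback above is reading when it looks for archive/data.pkl. The function name diagnose_checkpoint and its return strings are made up for illustration and are not part of PyTorch.

```python
import os
import zipfile


def diagnose_checkpoint(path):
    """Return 'ok' or a short description of what looks wrong with the file.

    Hypothetical helper: it only inspects the zip container and never
    unpickles anything, so it is safe to run on untrusted files.
    """
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    # A truncated download usually loses the zip central directory,
    # which lives at the end of the file.
    if not zipfile.is_zipfile(path):
        return "not a zip archive"
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
        # The pickle is stored as <archive name>/data.pkl.
        if not any(n == "data.pkl" or n.endswith("/data.pkl") for n in names):
            return "no data.pkl"
        # testzip() reads every member and returns the first one with a bad CRC.
        bad_member = zf.testzip()
        if bad_member is not None:
            return "bad CRC: " + bad_member
    return "ok"
```

Calling diagnose_checkpoint('cacheddata.pth') before torch.load would show whether the container itself is damaged. Note that a file saved in the older non-zip format would also come back as "not a zip archive", so that result alone does not prove corruption, and testzip() can be slow on large checkpoints.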
I was facing the same problem. I directly downloaded the model (.pt), trained with GPU, from a notebook on GCP AI Platform. When I loaded it locally with torch.load('models/model.pt', map_location=device), I got this error:
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
I noticed that the downloaded file was much smaller than expected. So, same as @Ian, it turned out the file was corrupted while downloading from the notebook. In the end I had to transfer the file from the notebook to a bucket on Google Cloud Storage (GCS) first, instead of downloading it directly, and then download it from GCS. It works now.
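A quick way to confirm that a transfer went wrong is to compare the file's size and checksum on both machines. The sketch below uses only the standard library; file_fingerprint is a made-up helper name, not something from PyTorch or the GCP tooling.

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at path.

    Hypothetical helper: reads the file in chunks so that large
    model files do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()
```

Run file_fingerprint('models/model.pt') in the notebook and again on the local copy. If the two results differ, the download was truncated or altered and torch.load cannot be expected to work on it.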