I would like to store thousands to millions of tensors with different shapes to disk. The goal is to use them as a time series dataset. The dataset will probably not fit into memory and I will have to load samples or ranges of samples from disk.
What is the best way to accomplish this while keeping storage and access time low?
One way would be to do np.save('file.npy', a.numpy()) and then convert back to a tensor after loading.
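For reference, a minimal sketch of that NumPy round-trip (the variable names are just illustrative):

```python
import numpy as np
import torch

a = torch.rand(3, 4, 5)

# Save the tensor to disk via its NumPy view
np.save('file.npy', a.numpy())

# Load it back and convert to a tensor again
b = torch.from_numpy(np.load('file.npy'))
```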
The commonly used way to store such data is in a single array that is laid out as a single, contiguous block within memory. More concretely, a 3x3x3 tensor would be stored simply as a single array of 27 values, one after the other.
Two tensors of the same size can be added together by using the + operator or the add function to get an output tensor of the same shape.
The easiest way to save anything to disk is by using pickle:
import pickle
import torch
a = torch.rand(3,4,5)
# save
with open('filename.pickle', 'wb') as handle:
    pickle.dump(a, handle)
# open
with open('filename.pickle', 'rb') as handle:
    b = pickle.load(handle)
You can also save tensors with PyTorch directly using torch.save, which is essentially a PyTorch wrapper around pickle:
import torch
x = torch.tensor([0, 1, 2, 3, 4])
torch.save(x, 'tensor.pt')
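Loading it back is symmetric, with torch.load:

```python
import torch

x = torch.tensor([0, 1, 2, 3, 4])
torch.save(x, 'tensor.pt')

# Load the tensor back from disk
y = torch.load('tensor.pt')
```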
If you want to save multiple tensors in one file, you can wrap them in a dictionary:
import torch
x = torch.tensor([0, 1, 2, 3, 4])
a = torch.rand(2,3,4,5)
b = torch.zeros(37)
torch.save({"a": a, "b": b, "x": x}, 'tensors.pt')