I need to create a fixed-length tensor in PyTorch that acts like a FIFO queue.
I have this function to do it:
def push_to_tensor(tensor, x):
    tensor[:-1] = tensor[1:]
    tensor[-1] = x
    return tensor
For example, I have:
tensor = Tensor([1,2,3,4])
>> tensor([ 1., 2., 3., 4.])
then using the function gives:
push_to_tensor(tensor, 5)
>> tensor([ 2., 3., 4., 5.])
However, I was wondering:
I implemented another FIFO queue:
def push_to_tensor_alternative(tensor, x):
    return torch.cat((tensor[1:], Tensor([x])))
The functionality is the same (although note that push_to_tensor modifies its argument in place, while push_to_tensor_alternative returns a new tensor), but then I checked their performance in speed:
# Small Tensor
tensor = Tensor([1,2,3,4])
%timeit push_to_tensor(tensor, 5)
>> 30.9 µs ± 1.26 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit push_to_tensor_alternative(tensor, 5)
>> 22.1 µs ± 2.25 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# Larger Tensor
tensor = torch.arange(10000)
%timeit push_to_tensor(tensor, 5)
>> 57.7 µs ± 4.88 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit push_to_tensor_alternative(tensor, 5)
>> 28.9 µs ± 570 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
It seems that push_to_tensor_alternative, which uses torch.cat instead of shifting all items to the left, is faster.
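For reference, a third approach that avoids both the element shift and the torch.cat allocation on every push is a circular buffer: keep a write index into a fixed tensor and only materialize the ordered view when it is actually needed. A minimal sketch (the RingTensor class and its method names are my own, not from either snippet above):

import torch

class RingTensor:
    """Fixed-length FIFO backed by a preallocated tensor and a write index."""
    def __init__(self, size):
        self.buf = torch.zeros(size)
        self.idx = 0  # next write position, which is also the oldest element

    def push(self, x):
        self.buf[self.idx] = x  # overwrite the oldest element in place
        self.idx = (self.idx + 1) % len(self.buf)

    def ordered(self):
        # roll so the oldest element comes first; allocates only on read
        return torch.roll(self.buf, -self.idx)

For example, pushing 1 through 5 into a RingTensor(4) and calling ordered() yields tensor([2., 3., 4., 5.]), matching the two functions above.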
Maybe a little late, but I found another way to do this and saved some time.
In my case, I needed a similar FIFO structure, but I only needed to actually parse the FIFO tensor once every N iterations; i.e., I needed a FIFO structure to hold N integers, and every N iterations I needed to parse that tensor through my model. I found it is much faster to use a collections.deque and cast that deque to a torch tensor only when needed.
import time
import torch
from collections import deque

length = 5000

que = deque([0] * 200)
ten = torch.tensor(que)

s = time.time()
for i in range(length):
    for j in range(200):
        que.pop()               # drop the oldest element
        que.appendleft(j * 10)  # push the newest element
    # after appending/popping elements, cast to tensor once per outer iteration
    torch.tensor(que)
print("finish deque:", time.time() - s)

s = time.time()
for i in range(length):
    for j in range(200):
        newelem = torch.tensor([j * 10])
        ten = torch.cat((ten[1:], newelem))
    # the tensor itself is the FIFO, so no cast is needed
print("finish tensor:", time.time() - s)
The results are the following:
finish deque: 0.15857529640197754
finish tensor: 9.483643531799316
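The gap makes sense: every torch.cat push allocates a fresh tensor and copies all 200 elements, so each push is O(n) plus PyTorch dispatch overhead, whereas deque.pop() and deque.appendleft() are O(1), and the single O(n) copy into a tensor happens only once per outer iteration. Wrapped as helpers, the pattern looks like this (a sketch; the function names are my own):

from collections import deque
import torch

def push(que, x):
    que.pop()          # O(1): drop the oldest element from the right end
    que.appendleft(x)  # O(1): insert the newest element at the left end

def snapshot(que):
    return torch.tensor(que)  # single O(n) copy, done only when the model needs it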
I also noticed that using a deque and casting to a torch.tensor after every push, instead of using push_to_tensor_alternative, can still give you a ~20% boost in time:
s = time.time()
for j in range(length):
    que.pop()
    que.appendleft(j * 10)
    torch.tensor(que)
print("finish queue:", time.time() - s)

s = time.time()
for j in range(length):
    newelem = torch.tensor([j * 10])
    ten = torch.cat((ten[1:], newelem))
print("finish tensor:", time.time() - s)
results:
finish queue: 8.422480821609497
finish tensor: 11.169137477874756
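As a side note (not part of the timings above), collections.deque also accepts a maxlen argument; appending to a full bounded deque automatically discards the element at the opposite end, so the explicit pop() can be dropped:

from collections import deque
import torch

que = deque([0] * 200, maxlen=200)
que.appendleft(42)       # the oldest element falls off the right end automatically
ten = torch.tensor(que)  # cast only when needed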