
Torch: How to shuffle a tensor by its rows?

I am currently working in Torch to implement a random shuffle of some input data along its rows (the first dimension in this case). I am new to Torch, so I am having trouble figuring out how permutation works.

The following is supposed to shuffle the data:

if argshuffle then 
    local perm = torch.randperm(sids:size(1)):long()
    print("\n\n\nSize of X and y before")
    print(X:view(-1, 1000, 128):size())
    print(y:size())
    print(sids:size())
    print("\nPerm size is: ")
    print(perm:size())
    X = X:view(-1, 1000, 128)[{{perm},{},{}}]
    y = y[{{perm},{}}]
    print(sids[{{1}, {}}])
    sids = sids[{{perm},{}}]
    print(sids[{{1}, {}}])
    print(X:size())
    print(y:size())
    print(sids:size())
    os.exit(69)
end

This prints:

Size of X and y before 
99 
1000
128
[torch.LongStorage of size 3]

99 
1
[torch.LongStorage of size 2]

99 
1
[torch.LongStorage of size 2]

Perm size is: 
99 
[torch.LongStorage of size 1]
5
[torch.LongStorage of size 1x1]
5
[torch.LongStorage of size 1x1]


99 
1000
128
[torch.LongStorage of size 3]

99 
1
[torch.LongStorage of size 2]

99 
1
[torch.LongStorage of size 2]

From the printed values, I infer that the data was not shuffled: the first row of sids is 5 both before and after. How can I make it shuffle correctly, and what is the common solution in Lua/Torch?

Asked by DaveTheAl on Jun 24 '17 at 16:06




2 Answers

I faced a similar issue. The documentation has no shuffle function for tensors (there are shuffle options for dataset loaders). I found a workaround using torch.randperm.

>>> import torch
>>> a=torch.rand(3,5)
>>> print(a)
tensor([[0.4896, 0.3708, 0.2183, 0.8157, 0.7861],
        [0.0845, 0.7596, 0.5231, 0.4861, 0.9237],
        [0.4496, 0.5980, 0.7473, 0.2005, 0.8990]])
>>> # Row shuffling
... 
>>> a=a[torch.randperm(a.size()[0])]
>>> print(a)
tensor([[0.4496, 0.5980, 0.7473, 0.2005, 0.8990],
        [0.0845, 0.7596, 0.5231, 0.4861, 0.9237],
        [0.4896, 0.3708, 0.2183, 0.8157, 0.7861]])
>>> # column shuffling
... 
>>> a=a[:,torch.randperm(a.size()[1])]
>>> print(a)
tensor([[0.2005, 0.7473, 0.5980, 0.8990, 0.4496],
        [0.4861, 0.5231, 0.7596, 0.9237, 0.0845],
        [0.8157, 0.2183, 0.3708, 0.7861, 0.4896]])

I hope it answers the question!
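The question shuffles three tensors (X, y and sids) that must stay aligned row by row. A minimal sketch of the same randperm indexing applied to several tensors at once is shown here. The helper name `shuffle_in_unison` and the small placeholder shapes are assumptions for illustration, not part of the original answer.

```python
import torch

def shuffle_in_unison(*tensors, generator=None):
    """Shuffle several tensors along dim 0 using one shared permutation.

    Hypothetical helper: returns the shuffled tensors and the permutation used.
    """
    n = tensors[0].size(0)
    if any(t.size(0) != n for t in tensors):
        raise ValueError("all tensors must have the same number of rows")
    perm = torch.randperm(n, generator=generator)
    # Indexing with a LongTensor along dim 0 reorders the rows.
    return tuple(t[perm] for t in tensors), perm

# Placeholder data: 99 samples as in the question, other dims scaled down.
X = torch.rand(99, 10, 4)
y = torch.arange(99).unsqueeze(1)
sids = torch.arange(99).unsqueeze(1)

(X_s, y_s, sids_s), perm = shuffle_in_unison(X, y, sids)
```

Because every tensor is indexed with the same `perm`, row i of `X_s` still corresponds to row i of `y_s` and `sids_s`.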

Answered by 11t on Nov 13 '22 at 13:11


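import torch

t = torch.rand(4, 3)  # placeholder tensor; substitute your own data
dim = 0
idx = torch.randperm(t.shape[dim])  # random permutation of the row indices

t_shuffled = t[idx]  # reorder the rows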

If your tensor is e.g. of shape CxNxF (channels by rows by features), then you can shuffle along the second dimension like so:

dim = 1
idx = torch.randperm(t.shape[dim])  # random permutation of the indices along dim 1

t_shuffled = t[:, idx]  # keep dim 0 as is, reorder dim 1
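The two snippets above differ only in where the index goes. One way to avoid writing a different slice per dimension is `index_select`, sketched here; the function name `shuffle_along` and the example tensor are assumptions made for illustration.

```python
import torch

def shuffle_along(t, dim=0):
    """Return t with its entries along `dim` randomly permuted, plus the permutation.

    Hypothetical helper built on torch.randperm and Tensor.index_select.
    """
    # Create the index on the same device as t, as index_select requires.
    idx = torch.randperm(t.size(dim), device=t.device)
    return t.index_select(dim, idx), idx

# Example tensor of shape CxNxF = 2x3x4 (values chosen only for illustration).
t = torch.arange(2 * 3 * 4).reshape(2, 3, 4)
t_shuffled, idx = shuffle_along(t, dim=1)
```

With `dim=1` this gives the same result as `t[:, idx]`, and the same call works for any other dimension.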
Answered by iacob on Nov 13 '22 at 12:11