 

Creating one hot vector from indices given as a tensor

Tags:

pytorch

I have a tensor of size 4 x 6, where 4 is the batch size and 6 is the sequence length. Every element of the sequence vectors is some index (0 to n). I want to create a 4 x 6 x n tensor where the vectors in the 3rd dimension are the one-hot encodings of those indices, i.e. a 1 at the specified index and zeros everywhere else.

For example, I have the following tensor:

[[5, 3, 2, 11, 15, 15],
[1, 4, 6, 7, 3, 3],
[2, 4, 7, 8, 9, 10],
[11, 12, 15, 2, 5, 7]]

Here, all the values are in the range 0 to n, where n = 15. So I want to convert the tensor to a 4 x 6 x 16 tensor where the third dimension holds the one-hot encoding vectors.

How can I do that using PyTorch functionality? Right now I am doing this with a loop, but I want to avoid looping!

Wasi Ahmad asked Jun 09 '17



1 Answer

New answer: As of PyTorch 1.1, there is a one_hot function in torch.nn.functional. Given any tensor of indices indices and a number of classes n (one more than the maximal index), you can create a one-hot version as follows:

import torch

n = 5
indices = torch.randint(0, n, size=(4, 7))
one_hot = torch.nn.functional.one_hot(indices, n)  # size = (4, 7, n)
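Applied to the example tensor from the question (a sketch; num_classes can be omitted and inferred from the data, but passing 16 makes the depth explicit):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[5, 3, 2, 11, 15, 15],
                  [1, 4, 6, 7, 3, 3],
                  [2, 4, 7, 8, 9, 10],
                  [11, 12, 15, 2, 5, 7]])

one_hot = F.one_hot(x, num_classes=16)
print(one_hot.shape)  # torch.Size([4, 6, 16])
print(one_hot[0, 0])  # a 1 at index 5, zeros elsewhere
```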

Very old answer

At the moment, slicing and indexing can be a bit of a pain in PyTorch from my experience. I assume you don't want to convert your tensors to numpy arrays. The most elegant way I can think of at the moment is to use sparse tensors and then convert to a dense tensor. That would work as follows:

import torch
from torch.sparse import FloatTensor as STensor

batch_size = 4
seq_length = 6
feat_dim = 16

# Batch (row) index for each of the 24 entries: 0,0,...,0,1,1,...
batch_idx = torch.LongTensor([i for i in range(batch_size) for s in range(seq_length)])
# Sequence (column) index for each entry: 0..5, repeated for every row
seq_idx = torch.LongTensor(list(range(seq_length)) * batch_size)
# The index values themselves, flattened into a vector of length 24
feat_idx = torch.LongTensor([[5, 3, 2, 11, 15, 15], [1, 4, 6, 7, 3, 3],
                             [2, 4, 7, 8, 9, 10], [11, 12, 15, 2, 5, 7]]).view(24,)

my_stack = torch.stack([batch_idx, seq_idx, feat_idx])  # indices must be nDim * nEntries
my_final_array = STensor(my_stack, torch.ones(batch_size * seq_length),
                         torch.Size([batch_size, seq_length, feat_dim])).to_dense()

print(my_final_array)
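On PyTorch versions that predate one_hot but support scatter_, a zeros-plus-scatter_ approach also avoids both the loop and sparse tensors (a sketch I am adding, not part of the original answer):

```python
import torch

batch_size, seq_length, feat_dim = 4, 6, 16
x = torch.tensor([[5, 3, 2, 11, 15, 15],
                  [1, 4, 6, 7, 3, 3],
                  [2, 4, 7, 8, 9, 10],
                  [11, 12, 15, 2, 5, 7]])

# Scatter a 1 into a zero tensor along the last dimension at each index.
one_hot = torch.zeros(batch_size, seq_length, feat_dim)
one_hot.scatter_(2, x.unsqueeze(-1), 1.0)
print(one_hot[0, 0])  # 1.0 at index 5, zeros elsewhere
```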

Note: PyTorch is currently undergoing some work that will add numpy-style broadcasting and other functionality within the next two or three weeks, so better solutions may well be available in the near future.

Hope this helps you a bit.

mbpaulus answered Nov 10 '22