From list of indices to one-hot matrix

What is the best (elegant and efficient) way in Theano to convert a vector of indices to a matrix of zeros and ones, in which every row is the one-of-N representation of an index?

import theano
import theano.tensor as t

v = t.ivector()  # the vector of indices
n = t.iscalar()  # the width of the matrix (an integer)
convert = <your code here>
f = theano.function(inputs=[v, n], outputs=convert)

Example:

n_val = 4
v_val = [1,0,3]
f(v_val, n_val)  # should give [[0,1,0,0],[1,0,0,0],[0,0,0,1]]
John Jaques asked Mar 19 '23 07:03


2 Answers

I haven't compared the different options, but you can also do it like this. It doesn't require extra memory.

import numpy as np
import theano

n_val = 4
v_val = np.asarray([1,0,3])
idx = theano.tensor.lvector()
z = theano.tensor.zeros((idx.shape[0], n_val))
one_hot = theano.tensor.set_subtensor(z[theano.tensor.arange(idx.shape[0]), idx], 1)
f = theano.function([idx], one_hot)
print(f(v_val))

[[ 0.  1.  0.  0.]
 [ 1.  0.  0.  0.]
 [ 0.  0.  0.  1.]]
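
For readers without Theano at hand, the same idea can be sketched in plain NumPy. This is an analogue, not the Theano graph above: the function name one_hot_scatter is hypothetical, and NumPy's indexed assignment stands in for set_subtensor.

```python
import numpy as np


def one_hot_scatter(indices, width):
    """Hypothetical helper: build a one-hot matrix by writing ones into zeros."""
    indices = np.asarray(indices, dtype=np.int64)
    # Start from an all-zero matrix with one row per index.
    out = np.zeros((indices.shape[0], width), dtype=np.float64)
    # Pair row i with column indices[i] and set that single cell to 1.
    out[np.arange(indices.shape[0]), indices] = 1.0
    return out


result = one_hot_scatter([1, 0, 3], 4)
print(result)
```

Only the output matrix is allocated here; no width-by-width intermediate is needed.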
nouiz answered Mar 28 '23 21:03


It's as simple as:

convert = t.eye(n,n)[v]

This builds the whole n-by-n identity matrix before selecting rows from it, which can be wasteful for a large n and a short v. A more efficient solution that avoids the identity matrix may still exist.
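
To make the memory concern concrete, here is a rough NumPy sketch of the same row-selection trick. NumPy is swapped in for Theano, and one_hot_eye is a hypothetical name. It also counts the cells in the intermediate identity matrix against the cells in the result.

```python
import numpy as np


def one_hot_eye(indices, width):
    """Hypothetical helper: select rows of an identity matrix."""
    identity = np.eye(width, dtype=np.int8)  # width * width cells
    return identity[np.asarray(indices)]     # one identity row per index


v_val = [1, 0, 3]
n_val = 4
rows = one_hot_eye(v_val, n_val)

# Size of the temporary identity matrix versus the size of the output.
intermediate_cells = n_val * n_val
output_cells = len(v_val) * n_val
print(rows)
print(intermediate_cells, output_cells)
```

The intermediate grows quadratically in n while the output grows as len(v) times n, so the gap widens when n is large and v is short.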

John Jaques answered Mar 28 '23 22:03