Convert array of indices to 1-hot encoded numpy array

Your array a defines the columns of the nonzero elements in the output array. You need to also define the rows and then use fancy indexing:

>>> import numpy as np
>>> a = np.array([1, 0, 3])
>>> b = np.zeros((a.size, a.max()+1))
>>> b[np.arange(a.size),a] = 1
>>> b
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])
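
As a minimal sketch, the fancy-indexing approach above could be wrapped in a small helper. The name `one_hot_indexed`, the optional `num_classes` argument, and the `dtype` default are assumptions made for illustration and are not part of the original answer; the input is assumed to be a non-empty 1-D array of non-negative integers.

```python
import numpy as np

def one_hot_indexed(indices, num_classes=None, dtype=np.float64):
    """One-hot encode a non-empty 1-D array of non-negative integer indices."""
    indices = np.asarray(indices)
    if indices.ndim != 1:
        raise ValueError("expected a 1-D array of indices")
    if indices.min() < 0:
        raise ValueError("indices must be non-negative")
    if num_classes is None:
        # Assumed default: infer the width from the largest index.
        num_classes = int(indices.max()) + 1
    out = np.zeros((indices.size, num_classes), dtype=dtype)
    # Row i gets a 1 in column indices[i]; every other entry stays 0.
    out[np.arange(indices.size), indices] = 1
    return out

b = one_hot_indexed([1, 0, 3])
```

Passing `num_classes` explicitly is probably the safer choice when a batch might not contain the largest label, since inferring the width from `max()` would then produce too few columns.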

>>> values = [1, 0, 3]
>>> n_values = np.max(values) + 1
>>> np.eye(n_values)[values]
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])
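
The following sketch compares the `np.eye` approach with zero-filling plus fancy indexing on the same input. The variable names `via_eye` and `via_index` and the choice of `np.uint8` are illustrative assumptions. Note that `np.eye` allocates a full `n_values` by `n_values` identity matrix before indexing, which may matter when the number of classes is large.

```python
import numpy as np

values = np.array([1, 0, 3])
n_values = values.max() + 1

# Row k of the identity matrix is the one-hot vector for class k,
# so indexing by `values` picks one row per label.
via_eye = np.eye(n_values, dtype=np.uint8)[values]

# Same result by zero-filling and setting one entry per row.
via_index = np.zeros((values.size, n_values), dtype=np.uint8)
via_index[np.arange(values.size), values] = 1

print(np.array_equal(via_eye, via_index))
```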

If you are using Keras, there is a built-in utility for that:

from keras.utils import to_categorical

categorical_labels = to_categorical(int_labels, num_classes=3)

It does pretty much the same thing as @YXD's answer (see the source code).

Here is what I find useful:

import numpy as np

def one_hot(a, num_classes):
    return np.squeeze(np.eye(num_classes)[a.reshape(-1)])

Here num_classes is the number of classes you have. If a is a vector of shape (10000,), this function transforms it into an array of shape (10000, C), where C is num_classes. Note that a is zero-indexed, i.e. one_hot(np.array([0, 1]), 2) will give [[1, 0], [0, 1]].

I believe this is exactly what you wanted.

PS: the source is Sequence models - deeplearning.ai
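
Because the function above flattens its input with reshape(-1), labels arranged in more than one dimension lose their layout. A possible variant, sketched here under the hypothetical name `one_hot_nd`, indexes the identity matrix directly so that the result keeps the shape of the input and appends a class axis.

```python
import numpy as np

def one_hot_nd(a, num_classes):
    """Illustrative variant: keep a's shape and append a class axis."""
    a = np.asarray(a)
    # Indexing with an integer array of shape S yields shape S + (num_classes,).
    return np.eye(num_classes)[a]

labels = np.array([[0, 1], [2, 1]])   # shape (2, 2)
encoded = one_hot_nd(labels, 3)       # shape (2, 2, 3)

# Taking argmax over the last axis recovers the original labels.
recovered = encoded.argmax(axis=-1)
```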

You can also use the eye function of NumPy, where num_classes is the number of classes and labels is the vector of integer labels:

numpy.eye(num_classes)[labels]

You can use sklearn.preprocessing.LabelBinarizer:

Example:

import sklearn.preprocessing
a = [1,0,3]
label_binarizer = sklearn.preprocessing.LabelBinarizer()
label_binarizer.fit(range(max(a)+1))
b = label_binarizer.transform(a)
print(b)

output:

[[0 1 0 0]
 [1 0 0 0]
 [0 0 0 1]]

Among other things, you can initialize sklearn.preprocessing.LabelBinarizer with sparse_output=True so that transform returns a sparse matrix instead of a dense array.
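
A minimal sketch of the sparse option, assuming scikit-learn and SciPy are installed. The variable names are illustrative; `sparse_output` is a constructor parameter of `LabelBinarizer`, and the result of `transform` is then a SciPy sparse matrix that can be converted back with `toarray()`.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

a = [1, 0, 3]

# Fit on every class 0..max(a) so that unused classes still get a column.
binarizer = LabelBinarizer(sparse_output=True)
binarizer.fit(np.arange(max(a) + 1))

sparse_b = binarizer.transform(a)   # SciPy sparse matrix, shape (3, 4)
dense_b = sparse_b.toarray()        # back to a regular NumPy array
```

A sparse result stores only the nonzero entries, which may help when the number of classes is large relative to the number of samples.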

