
Issues using Keras np_utils.to_categorical

Tags: python, keras

I'm trying to convert an array of one-hot vectors of integers into an array of one-hot vectors that Keras can use to fit my model. Here's the relevant part of the code:

Y_train = np.hstack(np.asarray(dataframe.output_vector)).reshape(len(dataframe),len(output_cols))
dummy_y = np_utils.to_categorical(Y_train)

Below is an image showing what Y_train and dummy_y actually are.

I couldn't find any documentation for to_categorical that could help me.

Thanks in advance.

Eduardo asked Jan 05 '17 21:01


2 Answers

np_utils.to_categorical converts an array of integer class labels (from 0 to num_classes - 1) into one-hot vectors.

The official docs cover it; the docstring and an example follow.

In [1]: from keras.utils import np_utils # from keras import utils as np_utils
Using Theano backend.

In [2]: np_utils.to_categorical?
Signature: np_utils.to_categorical(y, num_classes=None)
Docstring:
Convert class vector (integers from 0 to num_classes) to binary class matrix, for use with categorical_crossentropy.

# Arguments
    y: class vector to be converted into a matrix
    num_classes: total number of classes

# Returns
    A binary matrix representation of the input.
File:      /usr/local/lib/python3.5/dist-packages/keras/utils/np_utils.py
Type:      function

In [3]: y_train = [1, 0, 3, 4, 5, 0, 2, 1]

In [4]: """ Assuming the labeled dataset has total six classes (0 to 5), y_train is the true label array """

In [5]: np_utils.to_categorical(y_train, num_classes=6)
Out[5]:
array([[ 0.,  1.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  1.],
       [ 1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.]])
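To make the conversion above less of a black box, here is a minimal NumPy-only sketch of the same idea. The helper name `one_hot` is hypothetical and not part of Keras; it only assumes that the labels are non-negative integers and that, when `num_classes` is omitted, the number of classes is the largest label plus one.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """Hypothetical helper (not a Keras function): one row per label,
    with a 1.0 in the column whose index equals the label."""
    labels = np.asarray(labels, dtype=int).ravel()
    if num_classes is None:
        # Assumption: classes run from 0 up to the largest label seen.
        num_classes = labels.max() + 1
    encoded = np.zeros((labels.size, num_classes), dtype=float)
    # Row i gets a 1.0 at column labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


y_train = [1, 0, 3, 4, 5, 0, 2, 1]
encoded = one_hot(y_train, num_classes=6)
print(encoded)
```

Taking `encoded.argmax(axis=1)` recovers the original integer labels, which is a quick sanity check on any one-hot matrix.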
yardstick17 answered Oct 24 '22 00:10


from keras.utils.np_utils import to_categorical

UPDATE: keras.utils.np_utils is not available in newer versions; in that case use:

from tensorflow.keras.utils import to_categorical

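In both cases, the classes are assumed to run from 0 to the maximum value in the array, so calling to_categorical(y) is equivalent to:

to_categorical(y, num_classes=max_value_of_array + 1)

It assumes the class values were originally strings that you have label-encoded, so they always run from 0 to n_classes - 1.

For example, consider the array [1, 2, 3, 4, 2].

The output has one column per class value, from 0 to 4: [zero value, one value, two value, three value, four value]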

array([[ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  1.,  0.,  0.]])

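Let's look at another example.

Again, for an array with 3 distinct classes, Y = [4, 8, 9, 4, 9]

to_categorical(Y) will output 10 columns (indices 0 to 9) rather than 3, because the number of classes is inferred as the largest value plus one: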

array([[0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.],
       [0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])
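If you want one column per distinct class for sparse labels like Y, one option is to label-encode them to 0..n_classes - 1 first. The sketch below is only an illustration using plain NumPy (np.unique with return_inverse=True), not something taken from the Keras docs; the variable names are my own.

```python
import numpy as np

Y = np.array([4, 8, 9, 4, 9])

# classes holds the sorted distinct values; encoded_labels maps each
# entry of Y to the index of its value in classes (0..n_classes - 1).
classes, encoded_labels = np.unique(Y, return_inverse=True)
num_classes = len(classes)

# Pick rows of the identity matrix to build the one-hot matrix.
one_hot = np.eye(num_classes)[encoded_labels]
print(one_hot.shape)

# Map the one-hot rows back to the original label values.
decoded = classes[one_hot.argmax(axis=1)]
print(decoded)
```

The encoded_labels array could just as well be passed to to_categorical(encoded_labels, num_classes=num_classes); keeping classes around lets you translate predictions back to the original values.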
Pranzell answered Oct 24 '22 02:10