I have a numpy array data set with shape (100,10). Each row is a one-hot encoding. I want to transfer it into a nd-array with shape (100,) such that I transferred each vector row into a integer that denote the index of the nonzero index. Is there a quick way of doing this using numpy or tensorflow?
One-hot encoding is essentially the representation of categorical variables as binary vectors. These categorical values are first mapped to integer values. Each integer value is then represented as a binary vector that is all 0s (except the index of the integer which is marked as 1).
One hot encoding is one method of converting data to prepare it for an algorithm and get a better prediction. With one-hot, we convert each categorical value into a new categorical column and assign a binary value of 1 or 0 to those columns. Each integer value is represented as a binary vector.
One-hot encoding is the process by which categorical data are converted into numerical data for use in machine learning. Categorical features are turned into binary features that are “one-hot” encoded, meaning that if a feature is represented by that column, it receives a 1 . Otherwise, it receives a 0 .
You can use numpy.argmax or tf.argmax. Example:
import numpy as np a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]]) print('np.argmax(a, axis=1): {0}'.format(np.argmax(a, axis=1)))
output:
np.argmax(a, axis=1): [1 0 3]
You may also want to look at sklearn.preprocessing.LabelBinarizer.inverse_transform
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With