When doing logistic regression, it is common practice to use one-hot vectors as the desired output, so the number of classes equals the number of nodes in the output layer. We don't use the index of a word in the vocabulary (or a class number in general) because that may falsely indicate closeness between two classes. But why can't we use binary numbers instead of one-hot vectors? For instance, with 4 classes we could represent each class as 00, 01, 10, 11, which needs only log2(number of classes) nodes in the output layer.
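For concreteness, here is a minimal sketch of the two target encodings for 4 classes in plain NumPy (the variable names are only illustrative):

    import numpy as np

    num_classes = 4
    labels = np.array([0, 1, 2, 3])           # integer class indices

    # One-hot: one output node per class -> 4 nodes
    one_hot = np.eye(num_classes)[labels]
    # [[1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1]]

    # Binary: ceil(log2(num_classes)) output nodes -> 2 nodes
    n_bits = int(np.ceil(np.log2(num_classes)))
    binary = (labels[:, None] >> np.arange(n_bits)[::-1]) & 1
    # [[0,0], [0,1], [1,0], [1,1]]

    print(one_hot)
    print(binary)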
One-hot encoding can be applied to the integer representation: the integer-encoded variable is removed and a new binary variable is added for each unique integer value.
With binary encoding, as was used in the traffic light controller example, each state is represented as a binary number. Because N bits can distinguish 2^N values, a system with K states needs only ceil(log2 K) bits of state. In one-hot encoding, a separate bit of state is used for each state.
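As a quick sanity check of those bit counts (a small sketch; the traffic light controller itself is not reproduced here):

    import math

    for K in (2, 3, 4, 8, 10):
        binary_bits = max(1, math.ceil(math.log2(K)))
        print(f"K={K}: one-hot needs {K} bits, binary needs {binary_bits}")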
One-hot encoding is easy to implement and can be very fast, but it loses the meaning of the word within the sentence and thus the context of the sentence. Because of this, plain one-hot encoding is not widely used in many natural language processing applications.
It is fine if you encode with binary, but you will probably need to add another layer (or a filter), depending on your task and model, because the binary representation now implies spurious shared features between unrelated classes.
For example, a binary encoding for the input x = [x1, x2]:
'apple' = [0, 0]
'orange' = [0, 1]
'table' = [1, 0]
'chair' = [1, 1]
This means that orange and chair share the same feature x2. Now take predictions for two classes y:
'fruit' = 0
'furniture' = 1
And a linear model (W = [w1, w2] and bias b) optimized on a labeled data sample:

(argmin W) Loss = y - (w1 * x1 + w2 * x2 + b)
Whenever you update the w2 weight to predict chair as furniture, you get an undesirable update as if you were also predicting orange as furniture.
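Here is a minimal numerical sketch of that interference in plain NumPy, assuming the toy codes above, a squared-error loss, and an arbitrary learning rate:

    import numpy as np

    # Binary codes for the four words from the example above
    X = {'apple':  np.array([0., 0.]),
         'orange': np.array([0., 1.]),
         'table':  np.array([1., 0.]),
         'chair':  np.array([1., 1.])}

    w = np.array([0.0, 0.0])   # W = [w1, w2]
    b = 0.0
    lr = 0.1

    def predict(x):
        return w @ x + b

    print('orange before update:', predict(X['orange']))   # 0.0

    # One gradient step pushing 'chair' towards label 1 ('furniture'),
    # using the squared error 0.5 * (y - y_hat)^2
    x, y = X['chair'], 1.0
    err = y - predict(x)
    w += lr * err * x          # updates w1 AND w2, because x = [1, 1]
    b += lr * err

    # 'orange' shares x2 = 1 with 'chair', so its score also moved towards
    # 'furniture', even though it is labeled 'fruit' (0).
    print('orange after update: ', predict(X['orange']))   # 0.2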
In this particular case, adding another layer U = [u1, u2] can probably solve the issue:

(argmin U,W) Loss = y - (u1 * (w1 * x1 + w2 * x2 + b) +
                         u2 * (w1 * x1 + w2 * x2 + b) +
                         b2)
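And a minimal sketch of the extra-layer idea, again in NumPy. Note that the tanh nonlinearity on the hidden layer is an assumption on my part; with a purely linear second layer the model would collapse back into a single linear map:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: binary codes -> fruit (0) or furniture (1)
    X = np.array([[0., 0.],   # apple
                  [0., 1.],   # orange
                  [1., 0.],   # table
                  [1., 1.]])  # chair
    y = np.array([0., 0., 1., 1.])

    # Hidden layer (W, b) with tanh, then output layer (U, b2)
    W = rng.normal(scale=0.5, size=(2, 2))
    b = np.zeros(2)
    U = rng.normal(scale=0.5, size=2)
    b2 = 0.0
    lr = 0.5

    for _ in range(2000):
        h = np.tanh(X @ W + b)            # hidden activations, shape (4, 2)
        out = h @ U + b2                  # one score per example
        err = out - y                     # gradient of 0.5 * (out - y)^2
        grad_U = h.T @ err / len(X)
        grad_b2 = err.mean()
        dh = np.outer(err, U) * (1.0 - h ** 2)
        grad_W = X.T @ dh / len(X)
        grad_b = dh.mean(axis=0)
        U -= lr * grad_U
        b2 -= lr * grad_b2
        W -= lr * grad_W
        b -= lr * grad_b

    h = np.tanh(X @ W + b)
    print(np.round(h @ U + b2, 2))        # scores approach [0, 0, 1, 1]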
So why not avoid this misrepresentation in the first place by using one-hot encoding? :)