I'm following this tutorial (section 6: Tying it All Together), with my own dataset. I can get the example in the tutorial working, no problem, with the sample dataset provided.
I'm getting a binary cross-entropy error that is negative, and no improvements as epochs progress. I'm pretty sure binary cross-entropy should always be positive, and I should see some improvement in the loss. I've truncated the sample output (and code call) below to 5 epochs. Others seem to run into similar problems sometimes when training CNNs, but I didn't see a clear solution in my case. Does anyone know why this is happening?
Sample output:
Creating TensorFlow device (/gpu:2) -> (device: 2, name: GeForce GTX TITAN Black, pci bus id: 0000:84:00.0)
10240/10240 [==============================] - 2s - loss: -5.5378 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 2/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 3/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 4/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 5/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
My code:
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import History

history = History()

seed = 7
np.random.seed(seed)

dataset = np.loadtxt('train_rows.csv', delimiter=",")
#print dataset.shape  # (10240, 64)

# split into input (X) and output (Y) variables
X = dataset[:, 0:(dataset.shape[1]-2)]  # columns 0..61 (62 of the 64 columns)
Y = dataset[:, dataset.shape[1]-1]      # last column (index 63)
#print X.shape  # (10240, 62)
#print Y.shape  # (10240,)

testset = np.loadtxt('test_rows.csv', delimiter=",")
#print testset.shape  # (2560, 64)
X_test = testset[:, 0:(testset.shape[1]-2)]
Y_test = testset[:, testset.shape[1]-1]
#print X_test.shape  # (2560, 62)
#print Y_test.shape  # (2560,)

num_units_per_layer = [100, 50]

### create model
model = Sequential()
model.add(Dense(100, input_dim=(dataset.shape[1]-2), init='uniform', activation='relu'))
model.add(Dense(50, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

## Fit the model
model.fit(X, Y, validation_data=(X_test, Y_test), nb_epoch=5, batch_size=128)
The loss is just a scalar that you are trying to minimize; it is not required to be positive. One reason you can get negative values in the loss is that the training loss in RandomForestGraphs is implemented as cross-entropy loss, i.e. negative log-likelihood, per the reference code here.
There is no problem with a loss taking negative values. I have done this myself and it works just fine, and I see no reason why it should not: for calculating the gradient, the absolute value of the loss does not matter.
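To illustrate that last point, here is a tiny self-contained sketch (the quadratic loss and the constant shift are invented for illustration, not from the question's model): shifting a loss by a constant makes its values negative near the minimum, yet the finite-difference gradient is identical, so the optimizer behaves the same.

```python
def loss(w):
    # A toy loss with its minimum at w = 3
    return (w - 3.0) ** 2

def shifted_loss(w):
    # Same loss shifted down by a constant; negative near the minimum
    return loss(w) - 10.0

# Central finite-difference gradients at the same point are identical:
w, eps = 1.0, 1e-6
g1 = (loss(w + eps) - loss(w - eps)) / (2 * eps)
g2 = (shifted_loss(w + eps) - shifted_loss(w - eps)) / (2 * eps)
print(g1, g2)  # both approximately -4.0
```

The sign of the loss value itself carries no information for gradient descent; only its slope does.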
Yes, it is perfectly fine to use a loss that can become negative.
binary_crossentropy: used as the loss function for binary classification models; it computes the cross-entropy loss between the true labels and the predicted probabilities. categorical_crossentropy: used as the loss function for multi-class classification models, where there are more than two output classes and the labels are one-hot encoded.
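To see why the reported loss can go negative, here is a minimal NumPy sketch of the binary cross-entropy formula (hand-rolled for illustration, not Keras's actual implementation). With labels in {0, 1} the value is always non-negative, but a label outside [0, 1], such as the 2s in the question's dataset, can drive it negative:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred):
    # Elementwise BCE: -(y*log(p) + (1-y)*log(1-p))
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# With labels in {0, 1}, each term of the formula is >= 0:
print(binary_crossentropy(1.0, 0.9))  # small positive value
print(binary_crossentropy(0.0, 0.9))  # larger positive value

# With a label of 2 (outside [0, 1]) the (1 - y) factor flips sign,
# and the "loss" can become negative -- the symptom in the question:
print(binary_crossentropy(2.0, 0.9))  # negative value
```

This matches the training log above: the loss goes negative and saturates rather than improving, because the formula is being evaluated on labels it was never meant to handle.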
I should have printed out my response variable first. The categories were labelled as 1 and 2 instead of 0 and 1, which confused the classifier, since binary cross-entropy assumes labels in {0, 1}.
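A minimal sketch of the check and the fix (the array below is illustrative, not the actual dataset): print the unique label values before training, and shift 1/2 labels into {0, 1}:

```python
import numpy as np

# Hypothetical response variable saved with labels 1/2 instead of 0/1:
Y = np.array([1, 2, 2, 1, 2])

# Always inspect the label values before training:
print(np.unique(Y))  # [1 2]

# Shift them into {0, 1} so binary_crossentropy behaves as expected:
Y = Y - 1
print(np.unique(Y))  # [0 1]
```

The same one-line shift applied to `Y` and `Y_test` in the question's script would give a non-negative, decreasing loss.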