I have two possible values for the predicted label, -1 or 1.
Training looks good with either LSTM or Dense layers, but the prediction is always the same across different prediction datasets; changing the layers to Dense does not change the prediction. Maybe I am doing something wrong.
Here is the code:
// set up data arrays
float[,,] training_data = new float[training.Count(), 12, 200];
float[,,] testing_data = new float[testing.Count(), 12, 200];
float[,,] predict_data = new float[1, 12, 200];
IList<float> training_labels = new List<float>();
IList<float> testing_labels = new List<float>();
// Load Data and add to arrays
...
...
/////////////////////////
NDarray train_y = np.array(training_labels.ToArray());
NDarray train_x = np.array(training_data);
NDarray test_y = np.array(testing_labels.ToArray());
NDarray test_x = np.array(testing_data);
NDarray predict_x = np.array(predict_data);
train_y = Util.ToCategorical(train_y, 2);
test_y = Util.ToCategorical(test_y, 2);
// Build sequential model
var model = new Sequential();
model.Add(new Input(shape: new Keras.Shape(12, 200)));
model.Add(new BatchNormalization());
model.Add(new LSTM(128, activation: "tanh", recurrent_activation: "sigmoid", return_sequences: false));
model.Add(new Dropout(0.2));
model.Add(new Dense(32, activation: "relu"));
model.Add(new Dense(2, activation: "softmax"));
model.Compile(optimizer: new SGD(), loss: "binary_crossentropy", metrics: new string[] { "accuracy" });
model.Summary();
var history = model.Fit(train_x, train_y, batch_size: 1, epochs: 1, verbose: 1, validation_data: new NDarray[] { test_x, test_y });
var score = model.Evaluate(test_x, test_y, verbose: 2);
Console.WriteLine($"Test loss: {score[0]}");
Console.WriteLine($"Test accuracy: {score[1]}");
NDarray predicted = model.Predict(predict_x, verbose: 2);
Console.WriteLine($"Prediction: {predicted[0][0]*100}");
Console.WriteLine($"Prediction: {predicted[0][1]*100}");
And this is the output:
483/483 [==============================] - 9s 6ms/step - loss: 0.1989 - accuracy: 0.9633 - val_loss: 0.0416 - val_accuracy: 1.0000
4/4 - 0s - loss: 0.0416 - accuracy: 1.0000
Test loss: 0.04155446216464043
Test accuracy: 1
1/1 - 0s
Prediction: 0.0010418787496746518
Prediction: 99.99896287918091
The same prediction data used in ML.NET gives different results, but with ML.NET the accuracy is only 0.6, which is why I need deep learning.
I don't have C# set up to reproduce your code, but I see one small issue you may need to consider (I'm not sure whether it caused the trouble). Based on your code, I think you're using the wrong loss function for training. You set:
Util.ToCategorical(train_y, 2);
model.Add(new Dense(2, activation: "softmax"));
Then your loss function should be 'categorical_crossentropy', not 'binary_crossentropy', because you transformed your labels (-1, 1) into one-hot encoded vectors and used a softmax activation in your last layer.
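To make the pairing concrete, here is a quick sanity check in Python (a minimal sketch using TensorFlow's API directly, since I can't run the C# side): categorical_crossentropy scores one-hot targets against softmax outputs per sample.
import tensorflow as tf

y_true = tf.constant([[1., 0.], [0., 1.]])      # one-hot encoded labels
y_pred = tf.constant([[0.9, 0.1], [0.2, 0.8]])  # softmax-style outputs
cce = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(cce.numpy())  # per-sample loss: [-log(0.9), -log(0.8)] ≈ [0.105, 0.223]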
However, as you said, your labels are -1 and 1; so if you instead treat your problem as a binary classification problem, the setup should look something like the following (note that binary_crossentropy expects labels 0 and 1, so map -1 to 0 first):
// Util.ToCategorical(train_y, 2);  // no one-hot transformation
model.Add(new Dense(1, activation: "sigmoid"));
model.Compile(..., loss: "binary_crossentropy");
Reference.
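For comparison, a minimal TF/Keras sketch of that binary setup (an assumed Python equivalent of the Keras.NET calls; the shapes and layer sizes are placeholders, not taken from your model):
import numpy as np
import tensorflow as tf

x = np.random.normal(size=(80, 12, 200)).astype('float32')  # dummy inputs
y = np.random.choice([-1, 1], size=80).astype('float32')    # labels in {-1, 1}
y = (y + 1) / 2                                             # remap: -1 -> 0, 1 -> 1

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(16, input_shape=(12, 200)),
    tf.keras.layers.Dense(1, activation='sigmoid')          # single sigmoid unit
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=1, verbose=0)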
Here I will give some working demo code for better understanding. But before that, one small note. Let's say we have a training data set whose labels start below zero, for example [-2, -1, 0, 1]. To transform these integer values into one-hot encoded vectors, we can pick either the tf.keras.utils.to_categorical or the pd.get_dummies function. The small difference between the two is that with tf.keras.utils.to_categorical our integer labels must start from 0, which is not the case with pd.get_dummies (please check my other answers on this). In short:
import numpy as np
import pandas as pd
import tensorflow as tf

a = np.random.randint(-1, 1, size=(80))
a[:5]
array([-1, -1,  0,  0,  0])
pd.get_dummies(a).astype('float32').values[:5]
array([[1., 0.],
[1., 0.],
[0., 1.],
[0., 1.],
[0., 1.]], dtype=float32)
tf.keras.utils.to_categorical(a + 1, num_classes=2)[:5]  # shift labels so they start at 0
array([[1., 0.],
[1., 0.],
[0., 1.],
[0., 1.],
[0., 1.]], dtype=float32)
Okay, now here is some working demo code.
img = tf.random.normal([80, 32], 0, 1, tf.float32)        # dummy inputs
tar = pd.get_dummies(np.random.randint(-1, 1,             # mine: [-1, 1) - yours: [-1, 1]
                     size=80)).astype('float32').values   # one-hot targets

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim=32,
                                kernel_initializer='normal',
                                activation='relu'))
model.add(tf.keras.layers.Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.fit(img, tar, epochs=10, verbose=2)
Epoch 1/10
3/3 - 0s - loss: 0.7610 - accuracy: 0.4375
Epoch 2/10
3/3 - 0s - loss: 0.7425 - accuracy: 0.4375
....
Epoch 8/10
3/3 - 0s - loss: 0.6694 - accuracy: 0.5125
Epoch 9/10
3/3 - 0s - loss: 0.6601 - accuracy: 0.5750
Epoch 10/10
3/3 - 0s - loss: 0.6511 - accuracy: 0.5750
Inference
loss, acc = model.evaluate(img, tar); print(loss, acc)
pred = model.predict(img); print(pred[:5])
3ms/step - loss: 0.6167 - accuracy: 0.7250
0.6166597604751587 0.7250000238418579
# probabilities of the predicted labels -1 and 0
[[0.35116166 0.64883834]
[0.5542663 0.4457338 ]
[0.28023133 0.71976864]
[0.5024315 0.49756846]
[0.41029742 0.5897026 ]]
Now, if we do
print(pred[0])
pred[0].argmax(-1)  # we'd expect -1 or 0 as our label
[0.35116166 0.64883834]
1
It gives 0.35x and 0.64x for the target labels -1 and 0 respectively. But when we take .argmax over the predicted probabilities, it returns the zero-based index of the highest value. This is one reason to make training labels start from zero, and why, in your case, I think it's better to transform [-1, 1] to [0, 1].
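Continuing the demo, one simple way to recover the original labels is to index into the class array (here [-1, 0], matching the pd.get_dummies column order):
classes = np.array([-1, 0])             # column order of pd.get_dummies
pred_labels = classes[pred.argmax(-1)]  # map zero-based indices back
print(pred_labels[:5])                  # e.g. [ 0 -1  0 -1  0]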
Okay, lastly, as you mentioned, you want the predicted label and its corresponding confidence scores; for that, we can use tf.math.top_k with k = number of classes.
top_k_values, top_k_indices = tf.math.top_k(pred, k=2)

for values, indices in zip(top_k_values, top_k_indices):
    print(
        'For class {}, model confidence {:.2f}%'
        .format(indices.numpy()[0] - 1, values.numpy()[0] * 100)
    )
    print(
        'For class {}, model confidence {:.2f}%'
        .format(indices.numpy()[1] - 1, values.numpy()[1] * 100)
    )
    '''
    Note: above we subtract 1 to match the target
    labels (-1, 0). This would not be necessary if we
    initially transformed our labels from (-1, 0) to (0, 1),
    i.e. made them start from zero.
    '''
    print()
    break  # remove to see results for all samples
For class 0, model confidence 64.88%
For class -1, model confidence 35.12%
Verifying the score order
# pick first samples: input and label
model(img)[0].numpy(), tar[0]
(array([0.35116166, 0.64883834], dtype=float32),
array([0., 1.], dtype=float32))
Here,
0: for -1
1: for 0
# Again, it's better to transform (-1, 0) to (0, 1) at the start.
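Putting that recommendation together, here is a small sketch (with hypothetical values) of remapping your labels {-1, 1} to {0, 1} before training and inverting the mapping after prediction:
import numpy as np

y_raw = np.array([-1, 1, 1, -1])     # original labels
y = (y_raw + 1) // 2                 # remap: -1 -> 0, 1 -> 1; then one-hot encode
# ... train with to_categorical(y, 2) (Util.ToCategorical in Keras.NET) ...
pred_class = 1                       # example argmax from the model's output
original_label = pred_class * 2 - 1  # invert: 0 -> -1, 1 -> 1
print(original_label)                # 1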