I have two possible values for the predicted label, -1 or 1.
Training looks good with either LSTM or Dense layers, but the prediction is always the same across different prediction datasets; changing the layers to Dense does not change the prediction. Maybe I am doing something wrong.
Here is the code:
// set up data arrays
float[,,] training_data = new float[training.Count(), 12, 200];
float[,,] testing_data = new float[testing.Count(), 12, 200];
float[,,] predict_data = new float[1, 12, 200];
IList<float> training_labels = new List<float>();
IList<float> testing_labels = new List<float>();
// Load Data and add to arrays
...
...
/////////////////////////
NDarray train_y = np.array(training_labels.ToArray());
NDarray train_x = np.array(training_data);
NDarray test_y = np.array(testing_labels.ToArray());
NDarray test_x = np.array(testing_data);
NDarray predict_x = np.array(predict_data);
train_y = Util.ToCategorical(train_y, 2);
test_y = Util.ToCategorical(test_y, 2);
// Build sequential model
var model = new Sequential();
model.Add(new Input(shape: new Keras.Shape(12, 200)));
model.Add(new BatchNormalization());
model.Add(new LSTM(128, activation: "tanh", recurrent_activation: "sigmoid", return_sequences: false));
model.Add(new Dropout(0.2));
model.Add(new Dense(32, activation: "relu"));
model.Add(new Dense(2, activation: "softmax"));
model.Compile(optimizer: new SGD(), loss: "binary_crossentropy", metrics: new string[] { "accuracy" });
model.Summary();
var history = model.Fit(train_x, train_y, batch_size: 1, epochs: 1, verbose: 1, validation_data: new NDarray[] { test_x, test_y });
var score = model.Evaluate(test_x, test_y, verbose: 2);
Console.WriteLine($"Test loss: {score[0]}");
Console.WriteLine($"Test accuracy: {score[1]}");
NDarray predicted = model.Predict(predict_x, verbose: 2);
Console.WriteLine($"Prediction: {predicted[0][0]*100}");
Console.WriteLine($"Prediction: {predicted[0][1]*100}");
And this is the output:
483/483 [==============================] - 9s 6ms/step - loss: 0.1989 - accuracy: 0.9633 - val_loss: 0.0416 - val_accuracy: 1.0000
4/4 - 0s - loss: 0.0416 - accuracy: 1.0000
Test loss: 0.04155446216464043
Test accuracy: 1
1/1 - 0s
Prediction: 0.0010418787496746518
Prediction: 99.99896287918091
The same prediction data used in ML.NET gives different results, but with ML.NET the accuracy is only 0.6, which is why I need deep learning.
I don't have C# set up to reproduce your code, but I see one small issue you may need to consider (I'm not sure whether it caused the trouble). Based on your code, I think you're using the wrong loss function for training. You set:
Util.ToCategorical(train_y, 2);
model.Add(new Dense(2, activation: "softmax"));
Then your loss function should be 'categorical_crossentropy', not 'binary_crossentropy', because you transformed your labels (-1, 1) into one-hot encoded vectors and used a softmax activation in your last layer.
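To make the pairing concrete, here is a quick sanity check in Python (a minimal sketch using TensorFlow's API directly, since I can't run the C# side): categorical_crossentropy scores one-hot targets against softmax outputs per sample.
import tensorflow as tf

y_true = tf.constant([[1., 0.], [0., 1.]])      # one-hot encoded labels
y_pred = tf.constant([[0.9, 0.1], [0.2, 0.8]])  # softmax-style outputs
cce = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(cce.numpy())  # per-sample loss: [-log(0.9), -log(0.8)] ≈ [0.105, 0.223]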
However, as you said, your labels are -1 and 1; so if you instead treat your problem as a binary classification problem, the setup should look something like the following (note that binary_crossentropy expects labels 0 and 1, so map -1 to 0 first):
// Util.ToCategorical(train_y, 2);  // no one-hot transformation
model.Add(new Dense(1, activation: "sigmoid"));
model.Compile(..., loss: "binary_crossentropy");
Reference.
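For comparison, a minimal TF/Keras sketch of that binary setup (an assumed Python equivalent of the Keras.NET calls; the shapes and layer sizes are placeholders, not taken from your model):
import numpy as np
import tensorflow as tf

x = np.random.normal(size=(80, 12, 200)).astype('float32')  # dummy inputs
y = np.random.choice([-1, 1], size=80).astype('float32')    # labels in {-1, 1}
y = (y + 1) / 2                                             # remap: -1 -> 0, 1 -> 1

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(16, input_shape=(12, 200)),
    tf.keras.layers.Dense(1, activation='sigmoid')          # single sigmoid unit
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=1, verbose=0)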
Here I will give some working demo code for better understanding. But before that, one small note. Let's say we have a training data set whose labels start below zero, for example [-2, -1, 0, 1]. To transform these integer values into one-hot encoded vectors, we can pick either the tf.keras.utils.to_categorical or the pd.get_dummies function. The small difference between the two is that with tf.keras.utils.to_categorical our integer labels must start from 0, which is not the case with pd.get_dummies (please check my other answers on this). In short:
import numpy as np
import pandas as pd
import tensorflow as tf

a = np.random.randint(-1, 1, size=(80))
a[:5]
array([-1, -1,  0,  0,  0])
pd.get_dummies(a).astype('float32').values[:5]
array([[1., 0.],
[1., 0.],
[0., 1.],
[0., 1.],
[0., 1.]], dtype=float32)
tf.keras.utils.to_categorical(a + 1, num_classes=2)[:5]  # shift labels so they start at 0
array([[1., 0.],
[1., 0.],
[0., 1.],
[0., 1.],
[0., 1.]], dtype=float32)
Okay, now here is some working demo code.
img = tf.random.normal([80, 32], 0, 1, tf.float32)        # dummy inputs
tar = pd.get_dummies(np.random.randint(-1, 1,             # mine: [-1, 1) - yours: [-1, 1]
                     size=80)).astype('float32').values   # one-hot targets

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim=32,
                                kernel_initializer='normal',
                                activation='relu'))
model.add(tf.keras.layers.Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.fit(img, tar, epochs=10, verbose=2)
Epoch 1/10
3/3 - 0s - loss: 0.7610 - accuracy: 0.4375
Epoch 2/10
3/3 - 0s - loss: 0.7425 - accuracy: 0.4375
....
Epoch 8/10
3/3 - 0s - loss: 0.6694 - accuracy: 0.5125
Epoch 9/10
3/3 - 0s - loss: 0.6601 - accuracy: 0.5750
Epoch 10/10
3/3 - 0s - loss: 0.6511 - accuracy: 0.5750
Inference
loss, acc = model.evaluate(img, tar); print(loss, acc)
pred = model.predict(img); print(pred[:5])
3ms/step - loss: 0.6167 - accuracy: 0.7250
0.6166597604751587 0.7250000238418579
# probabilities of the predicted labels -1 and 0
[[0.35116166 0.64883834]
[0.5542663 0.4457338 ]
[0.28023133 0.71976864]
[0.5024315 0.49756846]
[0.41029742 0.5897026 ]]
Now, if we do
print(pred[0])
pred[0].argmax(-1)  # we'd expect -1 or 0 as our label
[0.35116166 0.64883834]
1
It gives 0.35x and 0.64x for the target labels -1 and 0 respectively. But when we take .argmax over the predicted probabilities, it returns the zero-based index of the highest value. This is one reason to make training labels start from zero, and why, in your case, I think it's better to transform [-1, 1] to [0, 1].
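Continuing the demo, one simple way to recover the original labels is to index into the class array (here [-1, 0], matching the pd.get_dummies column order):
classes = np.array([-1, 0])             # column order of pd.get_dummies
pred_labels = classes[pred.argmax(-1)]  # map zero-based indices back
print(pred_labels[:5])                  # e.g. [ 0 -1  0 -1  0]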
Okay, lastly, as you mentioned, you want the predicted label and its corresponding confidence scores; for that, we can use tf.math.top_k with k = number of classes.
top_k_values, top_k_indices = tf.math.top_k(pred, k=2)

for values, indices in zip(top_k_values, top_k_indices):
    print(
        'For class {}, model confidence {:.2f}%'
        .format(indices.numpy()[0] - 1, values.numpy()[0] * 100)
    )
    print(
        'For class {}, model confidence {:.2f}%'
        .format(indices.numpy()[1] - 1, values.numpy()[1] * 100)
    )
    '''
    Note: above we subtract 1 to match the target
    labels (-1, 0). This would not be necessary if we
    initially transformed our labels from (-1, 0) to (0, 1),
    i.e. made them start from zero.
    '''
    print()
    break  # remove to see results for all samples
For class 0, model confidence 64.88%
For class -1, model confidence 35.12%
Verifying the score order
# pick first samples: input and label
model(img)[0].numpy(), tar[0]
(array([0.35116166, 0.64883834], dtype=float32),
array([0., 1.], dtype=float32))
Here,
0: for -1
1: for 0
# Again, it's better to transform (-1, 0) to (0, 1) at the start.
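Putting that recommendation together, here is a small sketch (with hypothetical values) of remapping your labels {-1, 1} to {0, 1} before training and inverting the mapping after prediction:
import numpy as np

y_raw = np.array([-1, 1, 1, -1])     # original labels
y = (y_raw + 1) // 2                 # remap: -1 -> 0, 1 -> 1; then one-hot encode
# ... train with to_categorical(y, 2) (Util.ToCategorical in Keras.NET) ...
pred_class = 1                       # example argmax from the model's output
original_label = pred_class * 2 - 1  # invert: 0 -> -1, 1 -> 1
print(original_label)                # 1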