I'm building a CNN for sentiment analysis with Keras. Everything is working perfectly: the model is trained and ready to be launched to production.
However, when I predict on new unlabelled data with model.predict(), it only outputs the associated probability. I tried applying np.argmax() to the result, but it always outputs 0, even when it should be 1 (my model reached 80% accuracy on the test set).
Here is my code to pre-process the data:
# Pre-processing data
x = df[df.Sentiment != 3].Headlines
y = df[df.Sentiment != 3].Sentiment
# Splitting training, validation, testing dataset
x_train, x_validation_and_test, y_train, y_validation_and_test = train_test_split(x, y, test_size=.3,
random_state=SEED)
x_validation, x_test, y_validation, y_test = train_test_split(x_validation_and_test, y_validation_and_test,
test_size=.5, random_state=SEED)
tokenizer = Tokenizer(num_words=NUM_WORDS)
tokenizer.fit_on_texts(x_train)
sequences = tokenizer.texts_to_sequences(x_train)
x_train_seq = pad_sequences(sequences, maxlen=MAXLEN)
sequences_val = tokenizer.texts_to_sequences(x_validation)
x_val_seq = pad_sequences(sequences_val, maxlen=MAXLEN)
sequences_test = tokenizer.texts_to_sequences(x_test)
x_test_seq = pad_sequences(sequences_test, maxlen=MAXLEN)
And here is my model:
MAXLEN = 25
NUM_WORDS = 5000
VECTOR_DIMENSION = 100
tweet_input = Input(shape=(MAXLEN,), dtype='int32')
tweet_encoder = Embedding(NUM_WORDS, VECTOR_DIMENSION, input_length=MAXLEN)(tweet_input)
# Combining n-gram branches to improve results
bigram_branch = Conv1D(filters=100, kernel_size=2, padding='valid', activation="relu", strides=1)(tweet_encoder)
bigram_branch = GlobalMaxPooling1D()(bigram_branch)
trigram_branch = Conv1D(filters=100, kernel_size=3, padding='valid', activation="relu", strides=1)(tweet_encoder)
trigram_branch = GlobalMaxPooling1D()(trigram_branch)
fourgram_branch = Conv1D(filters=100, kernel_size=4, padding='valid', activation="relu", strides=1)(tweet_encoder)
fourgram_branch = GlobalMaxPooling1D()(fourgram_branch)
merged = concatenate([bigram_branch, trigram_branch, fourgram_branch], axis=1)
merged = Dense(256, activation="relu")(merged)
merged = Dropout(0.25)(merged)
output = Dense(1, activation="sigmoid")(merged)
optimizer = optimizers.Adam(0.01)
model = Model(inputs=[tweet_input], outputs=[output])
model.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=['accuracy'])
model.summary()
# Training the model
history = model.fit(x_train_seq, y_train, batch_size=32, epochs=5, validation_data=(x_val_seq, y_validation))
I also tried changing the number of units in the final Dense layer from 1 to 2, but I get an error:
Error when checking target: expected dense_12 to have shape (2,) but got array with shape (1,)
You are doing binary classification, so your final Dense layer has one unit with a sigmoid activation. The sigmoid function outputs a value in the range [0, 1], which is interpreted as the probability that the given sample belongs to the positive class (i.e. class one). By convention, everything below 0.5 is labeled zero (i.e. the negative class) and everything above 0.5 is labeled one. So to find the predicted class you can do the following:
preds = model.predict(data)
class_one = preds > 0.5
The True elements of class_one correspond to samples labeled one (i.e. the positive class).
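A minimal sketch of why np.argmax() is likely returning 0 here, and how thresholding gives integer labels instead. The probabilities below are made up for illustration; they only mimic the (n_samples, 1) shape that model.predict() returns for a network ending in a single sigmoid unit.

```python
import numpy as np

# Hypothetical probabilities, shaped like the output of model.predict()
# for a model ending in Dense(1, activation="sigmoid"): (n_samples, 1).
preds = np.array([[0.91], [0.12], [0.67], [0.40]])

# argmax over the last axis only ever sees one column, so it is always 0.
argmax_labels = np.argmax(preds, axis=-1)

# Thresholding at 0.5 recovers the class; ravel() flattens to (n_samples,).
labels = (preds > 0.5).astype("int32").ravel()

print(argmax_labels)  # [0 0 0 0]
print(labels)         # [1 0 1 0]
```

np.argmax() is the right tool only when the model has one output column per class (for example a softmax layer); with a single sigmoid column there is nothing to compare across.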
Bonus: to find the accuracy of your predictions, compare class_one with the true labels:
acc = np.mean(class_one.ravel() == true_labels)
Note that I have assumed that true_labels is a 1-D array of zeros and ones; ravel() flattens the (n, 1) predictions so that the comparison is element-wise.
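A small sketch of why the shapes matter in this comparison. The values are hypothetical; the point is that predictions of shape (n, 1) compared against labels of shape (n,) are broadcast by NumPy into an (n, n) matrix, so the mean of that matrix is not the accuracy.

```python
import numpy as np

preds = np.array([[0.91], [0.12], [0.67], [0.40]])  # hypothetical, shape (4, 1)
true_labels = np.array([1, 0, 0, 0])                # hypothetical, shape (4,)

class_one = preds > 0.5

# (4, 1) == (4,) broadcasts to a (4, 4) matrix of pairwise comparisons.
broadcast_shape = (class_one == true_labels).shape
misleading = np.mean(class_one == true_labels)

# Flattening the predictions first gives an element-wise comparison.
acc = np.mean(class_one.ravel() == true_labels)

print(broadcast_shape)  # (4, 4)
print(misleading)       # 0.5
print(acc)              # 0.75
```

If the labels are stored as a column of shape (n, 1) instead, the comparison is already element-wise and no flattening is needed; checking both shapes before comparing avoids the surprise.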
Further, if your model were defined with the Sequential class, you could use the predict_classes method (available in older Keras versions; it was removed in TensorFlow 2.6):
pred_labels = model.predict_classes(data)
However, since you are using the Keras functional API to construct your model (a very good choice, in my opinion), you can't use predict_classes, since it is ill-defined for such models.
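For a functional-API model, a small helper can stand in for predict_classes. The sketch below is one possible way to write it, not part of Keras: predict_labels and StubModel are hypothetical names, and StubModel exists only so the example runs without a trained network. It assumes the model's predict() returns sigmoid probabilities of shape (n_samples, 1).

```python
import numpy as np

def predict_labels(model, x, threshold=0.5):
    """Return 0/1 labels from a model whose predict() yields sigmoid probabilities."""
    probs = np.asarray(model.predict(x)).reshape(-1)
    return (probs > threshold).astype("int32")

class StubModel:
    """Stand-in for a trained Keras model, used only to make this example runnable."""
    def predict(self, x):
        # Pretend probability: fraction of non-zero (non-padding) tokens per row.
        x = np.asarray(x)
        return (x != 0).mean(axis=1, keepdims=True)

# Hypothetical padded sequences, like the output of pad_sequences.
x_new = np.array([[0, 2, 3, 7], [0, 0, 0, 5], [1, 2, 3, 4]])
labels = predict_labels(StubModel(), x_new)
print(labels)  # [1 0 1]
```

With the real model, the call would look like predict_labels(model, x_test_seq), where new text has first been passed through the same fitted tokenizer and pad_sequences(..., maxlen=MAXLEN) used for training.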