Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Keras predict() returns a better accuracy than evaluate()

I set up a model with Keras, then I trained it on a dataset of 3 records and finally I tested the resulting model with evaluate() and predict(), using the same test set for both functions (the test set has 100 records and it doesn't have any record of the training set, as much as it can be relevant, given the size of the two datasets). The dataset is composed by 5 files, where 4 files represent each one a different temperature sensor, that each minute collects 60 measurements (each row contains 60 measurements), while the last file contains the class labels that I want to predict (in particular, 3 classes: 3, 20 or 100).

This is the model I'm using:

n_sensors, t_periods = 4, 60

model = Sequential()

model.add(Conv1D(100, 6, activation='relu', input_shape=(t_periods, n_sensors)))

model.add(Conv1D(100, 6, activation='relu'))

model.add(MaxPooling1D(3))

model.add(Conv1D(160, 6, activation='relu'))

model.add(Conv1D(160, 6, activation='relu'))

model.add(GlobalAveragePooling1D())

model.add(Dropout(0.5))

model.add(Dense(3, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

That I train: self.model.fit(X_train, y_train, batch_size=3, epochs=5, verbose=1)

Then I use evaluate: self.model.evaluate(x_test, y_test, verbose=1)

And predict:

predictions = self.model.predict(data)
result = np.where(predictions[0] == np.amax(predictions[0]))
if result[0][0] == 0:
    return '3'
elif result[0][0] == 1:
    return '20'
else:
    return '100'

For each class predicted, I confront it with the actual label, and then I calculate correct guesses / total examples, that should be equivalent to accuracy from the evaluate() function. Here's the code:

correct = 0
for profile in self.profile_file: #profile_file is an opened file
    ts1 = self.ts1_file.readline()
    ts2 = self.ts2_file.readline()
    ts3 = self.ts3_file.readline()
    ts4 = self.ts4_file.readline()
    data = ts1, ts2, ts3, ts4
    test_data = self.dl.transform(data) # see the last block of code I posted
    prediction = self.model.predict(test_data)
    if prediction == label:
       correct += 1
acc = correct / 100 # 100 is the number of total examples

Data feeded to evaluate() is taken from this function:

label = pd.read_csv(os.path.join(self.testDir, 'profile.txt'), sep='\t', header=None)
label = np_utils.to_categorical(label[0].factorize()[0])
data = [os.path.join(self.testDir,'TS2.txt'),os.path.join(self.testDir, 'TS1.txt'),os.path.join(self.testDir,'TS3.txt'),os.path.join(self.testDir, 'TS4.txt')]
df = pd.DataFrame()
for txt in data:
    read_df = pd.read_csv(txt, sep='\t', header=None)
    df = df.append(read_df)
df = df.apply(self.__predict_scale)
df = df.sort_index().values.reshape(-1,4,60).transpose(0,2,1)
return df, label

While data feeded to predict() is taken from this other one:

df = pd.DataFrame()
for txt in data: # data 
    read_df = pd.read_csv(StringIO(txt), sep='\t', header=None)
    df = df.append(read_df)
df = df.apply(self.__predict_scale)
df = df.sort_index().values.reshape(-1,4,60).transpose(0,2,1)
return df

Accuracies yielded by evaluate() and predict() are always different: in particular, the maximum difference I noted was when evaluate() resulted in a 78% accuracy while predict() in a 95% accuracy. The only difference between the two functions is that I make predict() work on an example at a time, while evaluate() takes the entire dataset all at once, but it should result in no difference. How can it be?

UPDATE 1: It seems that the problem is in how I prepare my data. In the case of predict(), I transform only one line at a time from each file using the last block of code I posted, while in feeding evaluate(), I transform the entire files using the other function reported. Why should it be different? It seems to me that I'm applying the exact same transformation, the only difference is in the number of rows transformed.

like image 474
DDD Avatar asked Sep 09 '19 23:09

DDD


People also ask

What is difference between evaluate and predict in keras?

The keras. evaluate() function will give you the loss value for every batch. The keras. predict() function will give you the actual predictions for all samples in a batch, for all batches.

What does keras model predict return?

Model. predict passes the input vector through the model and returns the output tensor for each datapoint. Since the last layer in your model is a single Dense neuron, the output for any datapoint is a single value.

How do you predict accuracy from a model?

Accuracy is a metric used in classification problems used to tell the percentage of accurate predictions. We calculate it by dividing the number of correct predictions by the total number of predictions. This formula provides an easy-to-understand definition that assumes a binary classification problem.

What is the difference between keras evaluate() and keras predict()?

keras.evaluate () is for evaluating your trained model. Its output is accuracy or loss, not prediction to your input data. keras.predict () actually predicts, and its output is target value, predicted from your input data. What is the difference between class_weight and sample_weight in keras ?

What is binary accuracy in keras?

BinaryAccuracy class tf.keras.metrics.BinaryAccuracy(name="binary_accuracy", dtype=None, threshold=0.5) Calculates how often predictions match binary labels. This metric creates two local variables, total and count that are used to compute the frequency with which y_pred matches y_true.

What is the use of fit evaluation in keras?

Evaluation is a process during development of the model to check whether the model is best fit for the given problem and corresponding data. Keras model provides a function, evaluate which does the evaluation of the model. It has three main arguments, Test data. Test data label.

How do I reinstantiate a model in keras?

You can then use keras.models.load_model (filepath) to reinstantiate your model. load_model will also take care of compiling the model using the saved training configuration (unl What is the performance comparison, between Keras, TFLayer and TensorFlow?


Video Answer


1 Answers

This question was already answered here

what happens is when you evaluate the model, since your loss function is categorical_crossentropy, metrics=['accuracy'] calculates categorical_accuracy.

But predict has a default set to binary_accuracy.

So essentially you are calculating categorical accuracy with evaluate and and binary accuracy with predict. this is the reason they are so widely different.

the difference between categorical_accuracy and binary_accuracy is that categorical_accuracy check if all the outputs match with your y_test and binary_accuracy checks if each of you outputs matches with your y_test.

Example(single row):

prediction = [0,0,1,1,0]
y_test = [0,0,0,1,0]

categorical_accuracy = 0% 

since 1 output does not match the categorical_accuracy is 0

binary_accuracy = 80% 

even though 1 output doesn't match the rest of 80% do match so accuracy is 80%

like image 77
sparkles Avatar answered Nov 14 '22 23:11

sparkles