I'm trying to make a one-class classification convolutional neural network. By one-class I mean I have an image dataset containing about 200 images of Nicolas Cage, and by one-class classification I mean looking at an image and predicting 1 if Nicolas Cage is contained in the image and 0 if Nicolas Cage is not contained in the image.
I'm definitely a machine learning/deep learning beginner, so I was hoping someone with more knowledge and experience could help guide me in the right direction. Here are my issues and questions right now. My network is performing terribly. I've tried making a few predictions with images of Nicolas Cage and it predicts 0 every single time.
Here is what my dataset looks like; I collected it using a package called google-images-download. It contains about 200 images of Nicolas Cage. I did two searches to download 500 images, and after manually cleaning them I was down to 200 quality pictures of Nic Cage.
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Activation
classifier = Sequential()
classifier.add(Conv2D(32, (3, 3), input_shape = (200, 200, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Conv2D(64, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Flatten())
classifier.add(Dense(units = 64, activation = 'relu'))
classifier.add(Dropout(0.5))
# output layer
classifier.add(Dense(1))
classifier.add(Activation('sigmoid'))
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/train/',
                                                 target_size = (200, 200),
                                                 batch_size = 32,
                                                 class_mode = "binary")
test_set = test_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/test/',
                                            target_size = (200, 200),
                                            batch_size = 32,
                                            class_mode = "binary")
history = classifier.fit_generator(training_set,
                                   steps_per_epoch = 1000,
                                   epochs = 25,
                                   validation_data = test_set,
                                   validation_steps = 500)
Epoch 1/25
1000/1000 [==============================] - 1395s 1s/step - loss: 0.0012 - acc: 0.9994 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 2/25
1000/1000 [==============================] - 1350s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 3/25
1000/1000 [==============================] - 1398s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 4/25
1000/1000 [==============================] - 1342s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 5/25
1000/1000 [==============================] - 1327s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 6/25
1000/1000 [==============================] - 1329s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
.
.
.
The model looks like it converges to a loss value of 1.0000e-07, as this value doesn't change for the rest of the epochs.
[Plot: training and test accuracy]
[Plot: training and test loss]
from keras.preprocessing import image
import numpy as np
test_image = image.load_img('/Users/ginja/Desktop/Code/Nic_Cage/nic_cage_predict_1.png', target_size = (200, 200))
#test_image.show()
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
training_set.class_indices
if result[0][0] == 1:
    prediction = 'This is Nicolas Cage'
else:
    prediction = 'This is not Nicolas Cage'
print(prediction)
We get 'This is not Nicolas Cage' every single time for the prediction. I appreciate anyone who takes the time to read through this, and any help on any part of it.
If anyone finds this from Google, I figured it out. I did a couple of things:
flow_from_directory reads the folders in alphanumeric order, so the first folder in the directory becomes class "0". Took me way too long to figure that out. Binary classification also needs a second, "not Nicolas Cage" class, so I filled one with 200 random images:
import requests

path = "/Users/ginja/Desktop/Code/Nic_Cage/Random_images"
for i in range(200):
    url = "https://picsum.photos/200/200/?random"
    response = requests.get(url)
    if response.status_code == 200:
        file_name = 'not_nicolas_{}.jpg'.format(i)
        file_path = path + "/" + file_name
        with open(file_path, 'wb') as f:
            print("saving: " + file_name)
            f.write(response.content)
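With both folders in place, you can double-check which label each folder got by printing class_indices on the generator (the same attribute the question's code already touches); the folder names in the comment are just an example of a possible layout:

# Folders are read in alphanumeric order, so the mapping depends on their names.
print(training_set.class_indices)
# e.g. {'nicolas': 0, 'not_nicolas': 1}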
I added shuffle = True as a parameter in flow_from_directory to shuffle the images, which allows the network to generalize better.
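For reference, a minimal sketch of the updated call, reusing the training path from the question (shuffle is a standard flow_from_directory parameter):

training_set = train_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/train/',
                                                 target_size = (200, 200),
                                                 batch_size = 32,
                                                 class_mode = "binary",
                                                 shuffle = True)  # mix both classes within each batch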
I now have a training accuracy of 99% and a test accuracy of 91%, and I am able to predict images of Nicolas Cage successfully!
Everyone leans towards a binary classification approach. That may be a solution, but it removes the fundamental design objective, which may be to solve this with a one-class classifier. Depending on what you want to achieve, one-class classification can be an ill-conditioned problem, and in my experience that is often the case.
As mentioned in https://arxiv.org/pdf/1801.05365.pdf:
In the classical multiple-class classification, features are learned with the objective of maximizing inter-class distances between classes and minimizing intra-class variances within classes [2]. However, in the absence of multiple classes such a discriminative approach is not possible.
It yields a trivial solution. The reason why is explained a bit later:
The reason why this approach ends up yielding a trivial solution is due to the absence of a regularizing term in the loss function that takes into account the discriminative ability of the network. For example, since all class labels are identical, a zero loss can be obtained by making all weights equal to zero. It is true that this is a valid solution in the closed world where only normal chair objects exist. But such a network has zero discriminative ability when abnormal chair objects appear.
Note that the description here is made with regard to attempting to use one-class classifiers to solve for different classes. Another useful objective of one-class classifiers is to detect anomalies in, e.g., factory operation signals, which is what I am currently working on. In such cases, obtaining knowledge of the various damage states is very hard: it would be ridiculous to break a machine just to see how it operates when broken so that a decent multinomial classifier can be made. One solution to the problem is described in the following: https://arxiv.org/abs/1912.12502. Note that in that paper, because of the stochastic similarity of the classes, discriminative capacity across classes is achieved as well.
I found that by following the guidelines described there and, especially, removing the last activation function, I got my one-class classifier working, and the accuracy did not give 0 values. Note that in your case you may also want to drop the binary cross-entropy, since that loss only makes sense with binary targets (use RMSE instead).
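As a rough sketch of those two changes applied to the question's model (assuming a linear output unit, and Keras' built-in 'mse' loss as a stand-in for RMSE):

# Replace the question's sigmoid output and binary cross-entropy:
classifier.add(Dense(1))  # linear output -- no final activation
classifier.compile(optimizer = 'adam', loss = 'mse')  # 'mse' standing in for RMSE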
This method should also work for your case; the network would then be capable of determining which photos are numerically further away from the training photo class. In my experience, however, it is likely still a hard problem to solve because of the variance contained in the pictures, e.g. different backgrounds, angles, etc. By comparison, the problem I am solving is much easier, as there is much more similarity between operating conditions of the same condition stage. To put that into an analogy: in my case the training class is more like the same picture with different noise levels and only slight movements of objects.
Treating your problem as a supervised problem:
You are solving a face recognition problem. Your problem is a binary classification problem if you want to distinguish between "Nicolas Cage" and any other random image. For binary classification you need a second class with label 0, i.e. a "not Nicolas Cage" class.
A very famous example is the Hotdog-Not-Hotdog problem (Silicon Valley). These links might help you:
https://towardsdatascience.com/building-the-hotdog-not-hotdog-classifier-from-hbos-silicon-valley-c0cb2317711f
https://github.com/J-Yash/Hotdog-Not-Hotdog/blob/master/Hotdog_classifier_transfer_learning.ipynb
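For a rough idea of the transfer-learning route those links take (VGG16 as the frozen base is my assumption here, not necessarily what the notebook uses):

from keras.applications import VGG16
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# Frozen ImageNet features with a small binary head on top.
base = VGG16(weights = 'imagenet', include_top = False, input_shape = (200, 200, 3))
for layer in base.layers:
    layer.trainable = False

x = GlobalAveragePooling2D()(base.output)
x = Dense(64, activation = 'relu')(x)
output = Dense(1, activation = 'sigmoid')(x)  # Cage vs. not Cage

model = Model(inputs = base.input, outputs = output)
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])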
Treating your problem as an unsupervised problem:
Here you can represent each image as an embedding vector. Pass your Nicolas Cage images through a pre-trained FaceNet, which will give you a face embedding for each one, and plot those embeddings to see the relation between the images.
https://paperswithcode.com/paper/facenet-a-unified-embedding-for-face
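A minimal sketch of the distance check, assuming a pre-trained FaceNet wrapped in a hypothetical facenet_embed(image) function that returns a 128-d vector (the name is made up for illustration; any implementation from the link above provides the equivalent):

import numpy as np

def cosine_distance(a, b):
    # 1 - cosine similarity; close to 0 for near-identical embeddings
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# facenet_embed() is hypothetical -- substitute your FaceNet of choice:
# cage_mean = np.mean([facenet_embed(img) for img in cage_images], axis = 0)
# if cosine_distance(facenet_embed(query_image), cage_mean) < threshold:
#     print('This is Nicolas Cage')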