 

One class classification using Keras and Python

Intro and questions:

I'm trying to build a one-class classification convolutional neural network. By one class I mean I have a single image dataset containing about 200 images of Nicolas Cage. By one-class classification I mean: look at an image and predict 1 if Nicolas Cage is contained in the image, and predict 0 if Nicolas Cage is not contained in the image.

I'm definitely a machine learning/deep learning beginner, so I was hoping someone with more knowledge and experience could help guide me in the right direction. Here are my issues and questions right now. My network is performing terribly: I've tried making a few predictions with images of Nicolas Cage and it predicts 0 every single time.

  • Should I collect more data for this to work? I'm performing data augmentations on a small dataset of 207 images. I was hoping the data augmentations would help the network generalize, but I think I was wrong.
  • Should I try tweaking the number of epochs, steps per epoch, validation steps, or the optimization algorithm I'm using for gradient descent? I'm using Adam, but I was thinking maybe I should try stochastic gradient descent with different learning rates?
  • Should I add more convolution or dense layers to help my network better generalize and learn?
  • Should I just stop trying to do one-class classification and go to normal binary classification, because using a neural network for one-class classification is not very feasible? I saw this post here, one class classification with keras, and it seems like the OP ended up using an isolation forest. So I guess I could try using some convolutional layers and feeding into an isolation forest or an SVM (a rough sketch of this idea is shown right after this list)? I could not find a lot of info or tutorials about people using isolation forests for one-class image classification.
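For reference, here is a rough sketch of what I mean by feeding convolutional features into an isolation forest. The pretrained VGG16 feature extractor, the paths, and the IsolationForest settings are just illustrative guesses on my part, not something I have tested:

import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from sklearn.ensemble import IsolationForest

# Frozen pretrained CNN used purely as a feature extractor (512-d vectors via global average pooling)
feature_extractor = VGG16(weights = 'imagenet', include_top = False, pooling = 'avg',
                          input_shape = (200, 200, 3))

datagen = ImageDataGenerator(preprocessing_function = preprocess_input)
gen = datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/train/',
                                  target_size = (200, 200),
                                  batch_size = 32,
                                  class_mode = None,   # one class only, so no labels
                                  shuffle = False)

# Feature matrix of shape (num_images, 512)
features = feature_extractor.predict_generator(gen, steps = len(gen))

# Fit the one-class model on the Nicolas Cage features only;
# predict() on new feature vectors returns +1 for "looks like the training class" and -1 otherwise
iso = IsolationForest(contamination = 0.05, random_state = 42)
iso.fit(features)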

Dataset:

Here is a screenshot of what my dataset looks like; I collected it using a package called google-images-download. It contains about 200 images of Nicolas Cage. I did two searches to download 500 images, and after manually cleaning them I was down to 200 quality pictures of Nic Cage. Dataset
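Roughly, the download step with google-images-download looked something like this (the keyword, limit, and output directory here are placeholders rather than my exact searches):

from google_images_download import google_images_download

downloader = google_images_download.googleimagesdownload()
arguments = {
    "keywords": "Nicolas Cage",
    "limit": 100,                      # limits above 100 need the chromedriver argument
    "output_directory": "downloads",
    "format": "jpg",
}
downloader.download(arguments)         # saves images under ./downloads/Nicolas Cage/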


The imports and model:

from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Activation

classifier = Sequential()

classifier.add(Conv2D(32, (3, 3), input_shape = (200, 200, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

classifier.add(Conv2D(64, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

classifier.add(Flatten())

classifier.add(Dense(units = 64, activation = 'relu'))

classifier.add(Dropout(0.5))

# output layer
classifier.add(Dense(1))
classifier.add(Activation('sigmoid'))

Compiling and image augmentation

classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])


from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/train/',
                                                 target_size = (200, 200),
                                                 batch_size = 32,
                                                 class_mode = "binary")

test_set = test_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/test/',
                                            target_size = (200, 200),
                                            batch_size = 32,
                                            class_mode = "binary")

Fitting the model

history = classifier.fit_generator(training_set,
                         steps_per_epoch = 1000,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 500)

Epoch 1/25
1000/1000 [==============================] - 1395s 1s/step - loss: 0.0012 - acc: 0.9994 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 2/25
1000/1000 [==============================] - 1350s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 3/25
1000/1000 [==============================] - 1398s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 4/25
1000/1000 [==============================] - 1342s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 5/25
1000/1000 [==============================] - 1327s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
Epoch 6/25
1000/1000 [==============================] - 1329s 1s/step - loss: 1.0000e-07 - acc: 1.0000 - val_loss: 1.0000e-07 - val_acc: 1.0000
.
.
.

The model looks like it converges to a loss value of 1.0000e-07, as this doesn't change for the rest of the epochs.


Training and test accuracy plot

Training and test loss plot


Making the prediction

from keras.preprocessing import image
import numpy as np 

test_image = image.load_img('/Users/ginja/Desktop/Code/Nic_Cage/nic_cage_predict_1.png', target_size = (200, 200))
#test_image.show()
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
training_set.class_indices
if result[0][0] == 1:
    prediction = 'This is Nicolas Cage'
else:
    prediction = 'This is not Nicolas Cage'

print(prediction)

We get 'This is not Nicolas Cage' every single time for the prediction. I appreciate anyone that takes the time to even read through this and I appreciate any help on any part of this.

Asked by Drew Scatterday on Aug 01 '19


3 Answers

In case anyone finds this from Google: I figured it out. I did a couple of things:

  1. I added a dataset of random images to my train and test folders, essentially adding a "0" class labeled "not_nicolas". I downloaded the same number of images I had in the first dataset, about 200, so I had 200 images of Nicolas Cage and 200 images of random stuff. The random pictures were generated from https://picsum.photos/200/200/?random using the small Python script below. Make sure that when you use flow_from_directory it reads the folders in alphanumeric order, so the first folder in the directory will be class "0". Took me way too long to figure that out.
import requests

path = "/Users/ginja/Desktop/Code/Nic_Cage/Random_images"

for i in range(200):
    url = "https://picsum.photos/200/200/?random"
    response = requests.get(url)
    if response.status_code == 200:
        file_name = 'not_nicolas_{}.jpg'.format(i)
        file_path = path + "/" + file_name
        with open(file_path, 'wb') as f:
            print("saving: " + file_name)
            f.write(response.content)
  2. I changed the optimizer to stochastic gradient descent instead of Adam.
  3. I added shuffle = True as a parameter to flow_from_directory so the images are shuffled each epoch, which helps the network generalize better (a sketch of changes 2 and 3 is shown below).

    I now have a training accuracy of 99% and a Test accuracy of 91% and I am able to predict images of Nicolas Cage successfully!
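For concreteness, changes 2 and 3 amount to roughly the following, reusing the classifier and train_datagen from the question (the SGD learning rate and momentum are example values, not necessarily the exact ones I used):

from keras.optimizers import SGD

# Change 2: swap Adam for stochastic gradient descent
classifier.compile(optimizer = SGD(lr = 0.01, momentum = 0.9),
                   loss = 'binary_crossentropy',
                   metrics = ['accuracy'])

# Change 3: shuffle the training images each epoch
training_set = train_datagen.flow_from_directory('/Users/ginja/Desktop/Code/Nic_Cage/Small_Dataset/train/',
                                                 target_size = (200, 200),
                                                 batch_size = 32,
                                                 class_mode = "binary",
                                                 shuffle = True)

# Sanity check for point 1: prints the folder-to-label mapping so you can verify which class is "0"
print(training_set.class_indices)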

Answered by Drew Scatterday


Everyone leans towards a binary classification approach. This may be a solution, but it removes the fundamental design objective, which may be to solve it with a one-class classifier. Depending on what you want to achieve with a one-class classifier, it can be an ill-conditioned problem. In my experience, your last point often applies.

As mentioned in https://arxiv.org/pdf/1801.05365.pdf:

In the classical multiple-class classification, features are learned with the objective of maximizing inter-class distances between classes and minimizing intra-class variances within classes [2]. However, in the absence of multiple classes such a discriminative approach is not possible.

It yields a trivial solution. The reason why is explained a bit later:

The reason why this approach ends up yielding a trivial solution is due to the absence of a regularizing term in the loss function that takes into account the discriminative ability of the network. For example, since all class labels are identical, a zero loss can be obtained by making all weights equal to zero. It is true that this is a valid solution in the closed world where only normal chair objects exist. But such a network has zero discriminative ability when abnormal chair objects appear.

Note that the description here concerns attempting to use one-class classifiers to solve for different classes. Another useful objective of one-class classifiers is to detect anomalies in, for example, factory operation signals, which is what I am currently working on. In such cases, knowledge about the various damage states is very hard to obtain: it would be ridiculous to break a machine just to see how it operates when broken so that a decent multinomial classifier can be made. One solution to this problem is described in https://arxiv.org/abs/1912.12502. Note that in this paper, because of the stochastic similarity of the classes, discriminative capacity across classes is achieved as well.

I found that by following the guidelines described there, and especially by removing the last activation function, I got my one-class classifier working and the accuracy no longer gave 0 values. Note that in your case you may also want to drop binary cross-entropy, since that loss requires binary targets to make sense (use RMSE instead).

This method should also work for your case. The network would then be able to determine which photos are numerically further away from the training photo class. In my experience, however, it is likely still a hard problem to solve because of the variance contained in the pictures, e.g. different backgrounds, angles, etc. In that respect the problem I am solving is much easier, as there is much more similarity between operating conditions of the same condition stage. To put it into an analogy: in my case the training class is more like the same picture with different noise levels and only slight movements of objects.
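To make that concrete, here is a minimal sketch of "drop the final sigmoid and train with RMSE" applied to the question's architecture. Keras has no built-in RMSE loss, so the small custom loss below is an illustration rather than the exact code I used:

from keras import backend as K
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def rmse(y_true, y_pred):
    # root-mean-squared error, since Keras only ships plain MSE
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

one_class_model = Sequential([
    Conv2D(32, (3, 3), activation = 'relu', input_shape = (200, 200, 3)),
    MaxPooling2D(pool_size = (2, 2)),
    Conv2D(32, (3, 3), activation = 'relu'),
    MaxPooling2D(pool_size = (2, 2)),
    Conv2D(64, (3, 3), activation = 'relu'),
    MaxPooling2D(pool_size = (2, 2)),
    Flatten(),
    Dense(64, activation = 'relu'),
    Dropout(0.5),
    Dense(1)                 # no final activation, as suggested above
])

one_class_model.compile(optimizer = 'adam', loss = rmse)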

Answered by Sebastiaan van Baars


Treating your problem as a supervised problem:

You are solving a face recognition problem, and it becomes a binary classification problem if you want to distinguish between "Nicolas Cage" and any other random image. For binary classification you need a second class with label 0, i.e. a "not Nicolas Cage" class.

A very famous example of this is the Hotdog/Not-Hotdog problem (Silicon Valley). These links might help you.

https://towardsdatascience.com/building-the-hotdog-not-hotdog-classifier-from-hbos-silicon-valley-c0cb2317711f

https://github.com/J-Yash/Hotdog-Not-Hotdog/blob/master/Hotdog_classifier_transfer_learning.ipynb

Treating your problem as an unsupervised problem:

Here you represent each image as an embedding vector. Pass your Nicolas Cage images through a pre-trained FaceNet, which will give you face embeddings, and plot those embeddings to see the relationships between the images.

https://paperswithcode.com/paper/facenet-a-unified-embedding-for-face
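As a rough sketch of the embedding idea, assuming a pretrained FaceNet converted to Keras and saved as facenet_keras.h5 (a commonly shared conversion that maps 160x160 face crops to 128-d vectors); the file names and the simple per-image standardization are assumptions, not a tested pipeline:

import numpy as np
from keras.models import load_model
from keras.preprocessing import image

facenet = load_model('facenet_keras.h5')   # assumed pretrained FaceNet: 160x160 input, 128-d output

def embed(img_path):
    # Load a (pre-cropped) face image and map it to a 128-d embedding
    img = image.load_img(img_path, target_size = (160, 160))
    x = image.img_to_array(img)
    x = (x - x.mean()) / x.std()           # simple per-image standardization
    return facenet.predict(np.expand_dims(x, axis = 0))[0]

cage = embed('nic_cage_reference.jpg')     # placeholder file names
candidate = embed('nic_cage_predict_1.png')

# Cosine similarity between the embeddings; values close to 1 mean "same face"
similarity = np.dot(cage, candidate) / (np.linalg.norm(cage) * np.linalg.norm(candidate))
print(similarity)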

Answered by Tushar Gupta