
Will JPEG compression affect training and classification using convolutional neural networks?

We are working with a company that has more than 2 million images in JPEG format, and they want to collect more. The images are intended for machine classification and for finding small objects such as bolts and small water leaks. The total number of images is high, but the number of training examples is small, maybe only 100 samples or fewer.

Our suggestion to the company is to store the data uncompressed in the original 10- or 12-bit PNG/TIFF format. They want to use JPEG because they can collect more data in a shorter time (4 images per second) and do not need as much disk space.

Does anyone know how storing the images as JPEG rather than PNG will affect both training on the samples and finding/classifying objects later?

I have searched with Google. It returns many results on how to improve JPEG quality using deep learning. The rest of the results are about how to process cats and dogs using libraries found on the internet. There is one article that says JPEG compression affects recognition, but it says very little about what sort of images or what type of objects were involved.

When you look for large objects like dogs and cats, you have many features to work with: curves, colours, histograms and so on. Looking for very small objects with few characteristics is more complex.

Does anyone know of any article on this subject? Key question: should I store my images as PNG or lossless TIFF, or can I use JPEG compression for later use in deep learning?

User1953 asked Nov 26 '17 14:11




2 Answers

TL;DR: Yes, but not that much. Unless you are considering a JPEG quality setting below 10, you should be safe.

Longer version:

I highly recommend an article called Understanding How Image Quality Affects Deep Neural Networks. As you may guess, the authors checked how different distortions (JPEG, JPEG 2000, blur, and noise) affect the performance of common CNN architectures (VGG, AlexNet, GoogLeNet).

Apparently, all the tested nets behave in a similar way, and only severe JPEG compression (quality < 10) hurts them.

The one caveat is that nothing from the ResNet family was tested, but I don't see why it would behave drastically differently.
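To get a feel for what different quality settings do to your own kind of data, a rough sketch like the following may help. It does not reproduce the paper's experiment: it measures PSNR and file size on a synthetic image rather than classification accuracy, and the test image, the 3x3 "object", and the helper names `jpeg_roundtrip` and `psnr` are all assumptions made for illustration. It relies on Pillow and NumPy.

```python
import io

import numpy as np
from PIL import Image


def jpeg_roundtrip(pixels, quality):
    """Encode 8-bit grayscale pixels as JPEG in memory; return (decoded, size in bytes)."""
    buffer = io.BytesIO()
    Image.fromarray(pixels).save(buffer, format="JPEG", quality=quality)
    size = len(buffer.getvalue())
    buffer.seek(0)
    decoded = np.asarray(Image.open(buffer)).astype(np.float64)
    return decoded, size


def psnr(reference, test):
    """Peak signal-to-noise ratio in dB, assuming 8-bit data (peak value 255)."""
    mse = np.mean((reference.astype(np.float64) - test) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)


# Assumed stand-in for a real image: noisy background with a small bright
# 3x3 patch playing the role of a small object such as a bolt.
rng = np.random.default_rng(0)
pixels = rng.normal(120, 10, size=(128, 128)).clip(0, 255).astype(np.uint8)
pixels[62:65, 62:65] = 250

results = {}
for quality in (5, 10, 30, 50, 75, 95):
    decoded, size = jpeg_roundtrip(pixels, quality)
    results[quality] = (psnr(pixels, decoded), size)
    print(f"quality={quality:2d}  PSNR={results[quality][0]:5.1f} dB  size={size} bytes")
```

PSNR is only a proxy here. To answer the actual question you would replace it with the accuracy of your own classifier on images recompressed at each quality level.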

iezepov answered Oct 28 '22 08:10



You can find out by training your network first. As you have so little data, I would suggest either enlarging the dataset or trying another approach, such as unsupervised learning or reinforcement learning.

As for the quality loss, you can run a quick experiment. Take an image and save it as both JPG and PNG. Then load both as arrays, compute the difference, and visualize it. You will notice that the difference looks like noise on the image.
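A minimal sketch of that experiment, assuming Pillow and NumPy are available. It uses a synthetic in-memory image instead of a real file, and the `roundtrip` helper, the 4x4 bright square, and the quality value of 75 are illustrative choices, not anything from the question.

```python
import io

import numpy as np
from PIL import Image


def roundtrip(image, fmt, **save_kwargs):
    """Encode `image` in memory as `fmt`, decode it again, and return the pixels."""
    buffer = io.BytesIO()
    image.save(buffer, format=fmt, **save_kwargs)
    buffer.seek(0)
    return np.asarray(Image.open(buffer).convert("L")).astype(np.int16)


# Assumed test image: a horizontal gradient with one small bright square
# standing in for a small object.
height, width = 128, 128
pixels = np.tile(np.linspace(40, 200, width), (height, 1)).astype(np.uint8)
pixels[60:64, 60:64] = 255
original = Image.fromarray(pixels)

png_pixels = roundtrip(original, "PNG")
jpeg_pixels = roundtrip(original, "JPEG", quality=75)

# Absolute per-pixel difference against the source data.
reference = pixels.astype(np.int16)
png_diff = np.abs(png_pixels - reference)
jpeg_diff = np.abs(jpeg_pixels - reference)

print("PNG  max abs difference:", png_diff.max())
print("JPEG max abs difference:", jpeg_diff.max())
print("JPEG mean abs difference:", jpeg_diff.mean())
```

PNG is lossless, so its difference is zero everywhere, while the JPEG difference is nonzero. To look at it, you could scale `jpeg_diff` up and display it, for example with `Image.fromarray((jpeg_diff * 10).clip(0, 255).astype(np.uint8)).show()`.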

So, what does this mean?

If your inference success rate is affected by even this much noise, you had better take some precautions against overfitting. We expect a good CNN design to learn 'meaningful features' and suppress 'noise' in an image.

Go for JPG and fix your network's overfitting issues, if any.

Deniz Beker answered Oct 28 '22 09:10
