Image resizing method during preprocessing for neural network

Tags:

I am new to machine learning. I am trying to create an input matrix (X) from a set of images (Stanford dog set of 120 breeds) to train a convolutional neural network. I aim to resize images and turn each image into one row by making each pixel a separate column.

If I directly resize images to a fixed size, the images lose their originality due to squishing or stretching, which is not good (first solution).

I can resize by fixing either width or height and then crop it (all resultant images will be of the same size as 100x100), but critical parts of the image can be cropped (second solution).

I am thinking of another way of doing it, but I am sure. Assume I want 10000 columns per image. Instead of resizing images to 100x100, I will resize the image so that the total pixel count will be around 10000 pixels. So, images of size 50x200, 100x100 and 250x40 will all converted into 10000 columns. For other sizes like 52x198, the first 10000 pixels out of 10296 will be considered (third solution).

The third solution I mentioned above seems to preserve the original shape of the image. However, it may be losing all of this originality while converting into a row since not all images are of the same size. I wonder about your comments on this issue. It will also be great if you can direct me to sources I can learn about the topic.

475

asked Dec 12 '16 13:12

Mehmed

1 Answers

Solution 1 (simply resizing the input image) is a common approach. Unless you have a very different aspect ratio from the expected input shape (or your target classes have tight geometric constraints), you can usually still get good performance.

As you mentioned, Solution 2 (cropping your image) has the drawback of potentially excluding a critical part of your image. You can get around that by running the classification on multiple subwindows of the original image (i.e., classify multiple 100 x 100 sub-images by stepping over the input image horizontally and/or vertically at an appropriate stride). Then, you need to decide how to combine your multiple classification results.

Solution 3 will not work because the convolutional network needs to know the image dimensions (otherwise, it wouldn't know which pixels are horizontally and vertically adjacent). So you need to pass an image with explicit dimensions (e.g., 100 x 100) unless the network expects an array that was flattened from assumed dimensions. But if you simply pass an array of 10000 pixel values and the network doesn't know (or can't assume) whether the image was 100 x 100, 50 x 200, or 250 x 40, then the network can't apply the convolutional filters properly.

Solution 1 is clearly the easiest to implement but you need to balance the likely effect of changing the image aspect ratios with the level of effort required for running and combining multiple classifications for each image.

139

answered Oct 14 '22 02:10

bogatron

Related questions
                            
                                Can't resize image using CSS
                            
                                How can I fix this YCrCb -> RBG conversion formula?
                            
                                Scale images as a single surface in Java 2D API
                            
                                Is there a view controller for image crop and rotate works like iOS 8 photo.app?
                            
                                How to understand the average pixel number described in the frequency image?
                            
                                <img> unwanted photo rotation
                            
                                Difference between bitmap and bitmapdata
                            
                                iOS app submission error : ERROR ITMS-90032: "Invalid Image Path - No image found at the path referenced under key 'CFBundleIcons': 'AppIcon29x29'"
                            
                                De-skew characters in binary image
                            
                                Detecting circles and shots from paper target
                            
                                Images not visible in search results
                            
                                R raster plotting an image, draw a circle and mask pixels outside circle
                            
                                Extract a patch from an image given patch center and patch scale
                            
                                Custom drawable with ImageView
                            
                                How to add custom EXIF tags to a image
                            
                                Blur an image using java.util.concurrent, however, the resulting image is entirely black
                            
                                Java: save grayscale int array to image
                            
                                Taking a Base64 Encoded Image and Sending it with ExpressJS as an Image
                            
                                How to set the <img> tag with basic authentication
                            
                                Image processing library for Android and Java

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Image resizing method during preprocessing for neural network

Tags:

image

machine-learning

neural-network

classification

conv-neural-network

Mehmed

People also ask

1 Answers

bogatron

Recent Activity

Donate For Us