Create Numpy array of images

I have some (950) 150x150x3 .jpg image files that I want to read into a NumPy array.

Following is my code:

import glob

import cv2
import numpy as np

X_data = []
files = glob.glob("*.jpg")
for myFile in files:
    image = cv2.imread(myFile)
    X_data.append(image)

print('X_data shape:', np.array(X_data).shape)

The output is (950, 150). Why is the list not converted to an array of shape (950, 150, 150, 3), and is there a better way to create the array of images?

From what I have read, it is easier to append to a Python list and convert it to a NumPy array afterwards than to append to a NumPy array directly.

EDIT: Some more information (if it helps): image.shape correctly returns (150, 150, 3).

Abhishek Bansal asked Jun 10 '16 11:06



3 Answers

I tested your code. It works fine for me, with this output:

('X_data shape:', (4, 617, 1021, 3))

However, all of my images had exactly the same dimensions.

When I add another image with different dimensions, I get this output:

('X_data shape:', (5,))

When the shapes differ, NumPy cannot build a regular 4-D array, which is why the shape collapses. So I'd recommend checking that all images have the same size and the same number of channels (that is, are they really all colour images?). You should also check that either all of the images or none of them have an alpha channel (see @Gughan Ravikumar's comment).
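One way to run that check is to count how often each shape occurs before converting the list. The following is only a minimal sketch: `summarize_shapes` is a hypothetical helper, and the synthetic arrays stand in for what `cv2.imread` would return (it returns `None` when a file cannot be read).

```python
from collections import Counter

import numpy as np


def summarize_shapes(images):
    """Count how many images have each shape; None marks a failed read."""
    return Counter(None if img is None else img.shape for img in images)


# Synthetic stand-ins for loaded images (assumption: no real files are read here).
images = [
    np.zeros((150, 150, 3), dtype=np.uint8),
    np.zeros((150, 150, 3), dtype=np.uint8),
    np.zeros((150, 120, 3), dtype=np.uint8),  # the odd one out
]

shape_counts = summarize_shapes(images)
for shape, count in shape_counts.items():
    print(shape, count)

# np.stack raises ValueError on mismatched shapes, so the problem is not hidden.
try:
    np.stack(images)
    stack_failed = False
except ValueError:
    stack_failed = True
```

If more than one shape shows up in the counts, those files are the ones to resize, convert, or leave out.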

If only the number of channels varies (i.e. some images are greyscale), force all of them to load in colour format with:

image = cv2.imread (myFile, cv2.IMREAD_COLOR)

EDIT: I used the exact code from the question, only replacing the path with a directory of mine (and "*.PNG"):

import cv2
import glob
import numpy as np

X_data = []
files = glob.glob ("C:/Users/xxx/Desktop/asdf/*.PNG")
for myFile in files:
    print(myFile)
    image = cv2.imread (myFile)
    X_data.append (image)

print('X_data shape:', np.array(X_data).shape)

DomTomCat answered Oct 09 '22 00:10


Appending images to a list and then converting it to a NumPy array does not work for me. I have a large dataset, and I run out of RAM every time I attempt it. Instead, I append to a NumPy array directly, but this has its own drawbacks: appending to a list and then converting it costs memory, whereas appending to a NumPy array costs time. If you are patient enough, this takes care of the RAM problem.

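import os

import numpy as np
from PIL import Image
from tqdm import tqdm

def imagetensor(imagedir):
  images = None
  for im in tqdm(os.listdir(imagedir)):
    # os.listdir returns bare file names, so join them with the directory
    image = Image.open(os.path.join(imagedir, im)).convert('HSV')
    # Scale to [0, 1] and add a leading batch axis
    image = np.expand_dims(np.array(image, dtype=float) / 255, axis=0)
    # np.append copies the whole array on every call, which is what makes this slow
    images = image if images is None else np.append(images, image, axis=0)
  return images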

I am looking for better implementations that can take care of both space and time. Please comment if someone has a better idea.
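One possible approach, assuming the number of images and their common shape are known before loading, is to allocate the full array once and fill it in place. That avoids both the intermediate list and the repeated copying. This is only a sketch: `fill_preallocated` and `fake_images` are hypothetical names, and the generator yields synthetic data where a real loader would open files.

```python
import numpy as np


def fill_preallocated(images, count, shape, dtype=np.float32):
    """Copy `count` images of identical `shape` into one preallocated array."""
    shape = tuple(shape)
    out = np.empty((count,) + shape, dtype=dtype)  # allocated once, never regrown
    filled = 0
    for image in images:
        if filled >= count:
            raise ValueError(f"more than {count} images supplied")
        if image.shape != shape:
            raise ValueError(f"image {filled} has shape {image.shape}, expected {shape}")
        out[filled] = image  # in-place copy into row `filled`
        filled += 1
    if filled != count:
        raise ValueError(f"expected {count} images, got {filled}")
    return out


def fake_images(n):
    """Hypothetical stand-in for a loader: yields n synthetic 150x150x3 images in [0, 1]."""
    for k in range(n):
        yield np.full((150, 150, 3), k, dtype=np.uint8) / 255


batch = fill_preallocated(fake_images(4), count=4, shape=(150, 150, 3))
print(batch.shape, batch.dtype)
```

Because the images arrive from a generator, only one of them is held in memory besides the output array. Storing them as float32 rather than Python's default float (64-bit) also halves the size of the result.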

Mridul Pandey answered Oct 09 '22 00:10


Here is a solution for images whose file names contain certain special Unicode characters, and for PNGs with a transparency layer, two cases that I had to handle in my dataset. In addition, any image that does not have the desired resolution is not added to the NumPy array. This uses the Pillow package instead of cv2.

resolution = 150

import glob
import numpy as np
from PIL import Image

X_data = []
files = glob.glob(r"D:\Pictures\*.png")
for my_file in files:
    print(my_file)
    
    image = Image.open(my_file).convert('RGB')
    image = np.array(image)
    if image.shape != (resolution, resolution, 3):
        print(f'This image is bad: {my_file} {image.shape}')
    else:
        X_data.append(image)

print('X_data shape:', np.array(X_data).shape)
# If you have 950 150x150 images, this would print 'X_data shape: (950, 150, 150, 3)'

The f-string requires Python 3.6+; on older versions, replace it with regular string formatting. The r-string (raw string) works in every Python version. It only saves you from writing \\ instead of \ in Windows paths, so you can also use a regular string with doubled backslashes.
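As a small illustration of that replacement, the message can be built with `str.format`, which also works before Python 3.6. The path and shape below are made-up example values, not taken from a real dataset.

```python
# Hypothetical example values standing in for a real file and its shape.
my_file = r"D:\Pictures\example.png"
shape = (150, 120, 3)

# Equivalent of f'This image is bad: {my_file} {shape}' without an f-string.
message = 'This image is bad: {} {}'.format(my_file, shape)
print(message)
```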

mic answered Oct 09 '22 02:10