Question

I am trying to read multiple RGB images into one matrix, such that the matrix dimensions are (image_size, image_size, index), e.g. data[:,:,0] should retrieve the first image.

data = np.zeros((image_dim, image_dim, numImages), dtype=np.float64)
for fname in os.listdir('images/sample_images/'):
       name='....'
       image=mpimg.imread(name)
       data = np.append(data, image)
return data

image.shape = (512, 512, 3)
data.shape = (512, 512, 100)

Apart from the fact that np.append leaves me with an empty data array, is there another way of appending the image-array values to a big data matrix?

Thanks in advance

rayryeng · Accepted Answer

Falko's post is certainly the canonical way to do it. However, if I can suggest a more idiomatic NumPy layout, I would let the first dimension be the index of the image you want, the second and third dimensions be the rows and columns of the image, and optionally the fourth dimension be the colour channel. So, supposing that each image is M x N and you have K images, you would create an array of shape K x M x N, or K x M x N x 3 in the case of colour images.
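As a rough sketch of that layout, assuming the number of images and their size are known up front, you can preallocate the array and fill one slot per image. The values of K, M and N and the synthetic images below are placeholders for illustration, not taken from the question:

```python
import numpy as np

K, M, N = 4, 6, 8  # hypothetical: 4 colour images of 6 rows x 8 columns

# Image index first, then rows, columns and colour channel.
data = np.zeros((K, M, N, 3), dtype=np.float64)

for k in range(K):
    # Stand-in for a real image read; each fake image is filled with k.
    image = np.full((M, N, 3), k, dtype=np.float64)
    data[k] = image  # writes the whole M x N x 3 image into slot k

print(data.shape)     # (4, 6, 8, 3)
print(data[2].shape)  # (6, 8, 3)
```

Assigning into data[k] fills the existing array in place, so nothing is reallocated as images are added.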

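With that layout, a simple one-liner in NumPy, adapted from your current code, could be the following (os.listdir returns bare file names, so each one is joined with the folder path before reading):

data = np.array([mpimg.imread(os.path.join('images/sample_images/', name)) for name in os.listdir('images/sample_images/')], dtype=np.float64)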

To access the i-th image (counting from 0), you would simply write data[i]. This works whether the images are RGB or grayscale: data[i] gives you an RGB image or a grayscale image, depending on what you packed into the array. However, you need to make sure that all of the images are consistent, that is, the same size and either all colour or all grayscale.
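One way to guard against inconsistent inputs is to compare the shapes before stacking. The sketch below is only an illustration: the helper name stack_images and the synthetic inputs are assumptions, not part of the original code.

```python
import numpy as np

def stack_images(images):
    """Stack equally shaped images along a new leading axis (hypothetical helper)."""
    images = [np.asarray(img, dtype=np.float64) for img in images]
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        # Either no images were given or their shapes disagree.
        raise ValueError("inconsistent image shapes: %s" % sorted(shapes))
    return np.stack(images, axis=0)

# Three synthetic 5 x 5 RGB images stack cleanly.
rgb = [np.ones((5, 5, 3)) * i for i in range(3)]
print(stack_images(rgb).shape)  # (3, 5, 5, 3)

# Adding one grayscale image makes the set inconsistent.
mixed = rgb + [np.ones((5, 5))]
try:
    stack_images(mixed)
except ValueError as err:
    print("rejected:", err)
```

Failing early with an explicit message is easier to debug than discovering later that the array does not have the expected number of dimensions.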

To show you that this works, let's try it with ten 5 x 5 x 3 "RGB" images, where image i is filled with the value i, so the values run from 0 up to K-1, with K = 10 in this case:

data = np.array([i*np.ones((5,5,3)) for i in range(10)], dtype=np.float64)

Let's see a sample run (in IPython):

In [26]: data = np.array([i*np.ones((5,5,3)) for i in range(10)], dtype=np.float64)

In [27]: data.shape
Out[27]: (10, 5, 5, 3)

In [28]: img = data[0]

In [29]: img
Out[29]: 
array([[[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]])

In [30]: img.shape
Out[30]: (5, 5, 3)

In [31]: img = data[7]

In [32]: img
Out[32]: 
array([[[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]]])

In [33]: img.shape
Out[33]: (5, 5, 3)

In the above sample run, I create the sample data array and it's 10 x 5 x 5 x 3 as expected: ten 5 x 5 x 3 matrices. Next, I extract the first "RGB" image and it's all 0s as expected, with a size of 5 x 5 x 3. I also extract the eighth image (index 7) and get all 7s as expected, again with a size of 5 x 5 x 3.

Obviously, choose whichever answer you think is best, but I personally would go with the above route, as indexing into your array to grab the right image is simpler: data[i] indexes only the first dimension, and NumPy returns everything along the remaining dimensions for you.
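To make the indexing concrete, here is a small sketch that reuses the synthetic data from the sample run. It also shows one possible way, not something taken from the question's code, to move the image index to the last position with np.moveaxis in case other code expects that layout:

```python
import numpy as np

# Ten 5 x 5 x 3 images, image i filled with the value i (as in the sample run).
data = np.array([i * np.ones((5, 5, 3)) for i in range(10)], dtype=np.float64)

# Indexing only the first axis returns the whole image:
# data[7] is shorthand for data[7, :, :, :] (or data[7, ...]).
print(np.array_equal(data[7], data[7, ...]))  # True

# If the image index is needed last, move axis 0 to the end.
data_last = np.moveaxis(data, 0, -1)
print(data_last.shape)                              # (5, 5, 3, 10)
print(np.array_equal(data_last[..., 7], data[7]))   # True
```

With the index first, data[i] is enough; with the index last, every access needs the leading slices or an ellipsis.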

Falko · Answer

It is better to use np.dstack for stacking arrays along the third dimension:

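import numpy as np

data = np.zeros((3, 3, 0))  # empty stack: no images yet
for i in range(5):
    image = np.random.rand(3, 3, 1)  # random single-channel image
    data = np.dstack((data, image))  # append along the third axis
print(data.shape)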

Output:

(3, 3, 5)

Note: Here I assume that each (random) image has one channel. If you have RGB images, the third dimension would hold three channels per image, i.e. you'd end up with shape (3, 3, 15) for five images.
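The RGB case from the note can be checked with a short sketch using synthetic images. It also shows np.stack as one possible alternative that keeps a separate slot per image. Collecting the images in a list and stacking once avoids allocating a new array on every loop iteration:

```python
import numpy as np

# Five random 3 x 3 RGB images (synthetic stand-ins for real data).
images = [np.random.rand(3, 3, 3) for _ in range(5)]

# dstack joins along the existing third axis, so the channels of all images run together.
merged = np.dstack(images)
print(merged.shape)   # (3, 3, 15)

# stack adds a new last axis instead, keeping one slot per image.
stacked = np.stack(images, axis=-1)
print(stacked.shape)  # (3, 3, 3, 5)

# Image 2 can be recovered from both layouts, more directly from the stacked one.
print(np.array_equal(stacked[..., 2], images[2]))    # True
print(np.array_equal(merged[:, :, 6:9], images[2]))  # True
```

In the merged layout, image k occupies channels 3*k to 3*k + 2, so the caller has to do that arithmetic on every access.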

Python - read images into image matrix

Tags:

arrays

image-processing

matrix

jenpaff
