
Equal arrays but not the same visually

I have a 32x32x3 image, for example one of the cifar10 images in keras. Now, say I want to do some manipulation. First, to make sure I am doing things right, I tried to copy the image (that is not what I ultimately want to do, so please don't tell me how to copy the image without three loops; I need the three loops to manipulate some values).

from keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy

(X_train, Y_train), (X_test, Y_test) = cifar10.load_data()
im = numpy.reshape(X_train[0], (3, 32, 32))
im = im.transpose(1,2,0)
imC = numpy.zeros((32,32,3))

for k in range(3):
  for row in range(0,32):
    for col in range(0,32):
      imC[row][col][k] = im[row][col][k]

Now, if I test whether they are the same, they are: I see "cool" printed out.

if (im==imC).all():
  print("cool")

But when I try to visualize them, they are different:

plt.imshow( imC )
plt.show()

plt.imshow( im )
plt.show()

What is going on?

asked May 23 '16 14:05 by user


1 Answer

The images in the Python CIFAR10 dataset have pixel values of type numpy.uint8. (Presumably they are read from PNG files or something of the kind.) So X_train.dtype == numpy.uint8 and hence im.dtype == numpy.uint8.

The array you create has the default element type of numpy.float64. In other words, imC.dtype == numpy.float64.
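A quick way to see the mismatch is to print both dtypes. This is only a sketch: it uses a random uint8 array as a stand-in for the CIFAR-10 image (an assumption, so that it runs without keras) and copies the values with a slice assignment rather than the three loops.

```python
import numpy

# Stand-in for one CIFAR-10 image: random values, NOT the real dataset.
im = numpy.random.randint(0, 256, size=(32, 32, 3)).astype(numpy.uint8)

# No dtype argument, so numpy.zeros falls back to float64.
imC = numpy.zeros((32, 32, 3))

# Copy the values; they are converted to float on assignment.
imC[:] = im

print(im.dtype)            # uint8
print(imC.dtype)           # float64
print((im == imC).all())   # True: equal values, different element types
```

The equality test compares values only, which is why it passes even though the two arrays have different element types.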

It happens that matplotlib.pyplot.imshow treats its input differently depending on its element type. In particular, if you give it an m-by-n-by-3 array of element type uint8 it will take 0 to mean darkest and 255 to mean lightest for each of the three colour channels, as you would expect; if you give it an m-by-n-by-3 array of element type float64, though, it wants all the values to be in the range 0 (darkest) to 1 (lightest), and the documentation says nothing about what will happen to values outside that range.

I will hazard a guess at what does happen to values outside that range: I think the code probably does something like this: multiply by 255, round to integer, treat as uint8. This means that 0 becomes 0 and 1 becomes 255.

But if that last step means throwing away all but the low 8 bits, it also means that 2 becomes 254, 3 becomes 253, ..., 255 becomes 1! In other words, if you make the very understandable mistake of giving imshow an image whose pixel values are floats in the range 0..255, those values will effectively be negated so that 0->0, 1->255, 2->254,...,255->1. (This isn't quite the same as turning the range exactly upside down, because 0 is preserved.)
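The arithmetic of that guess can be checked directly. The sketch below reproduces only the guessed steps (multiply by 255, round, keep the low 8 bits); it is not taken from matplotlib's source, and the variable names are made up for illustration.

```python
import numpy

# Float pixel values 0..255, as in the mistaken input to imshow.
values = numpy.arange(256, dtype=numpy.float64)

# Guessed conversion: scale by 255 and round to an integer ...
scaled = numpy.round(values * 255).astype(numpy.int64)

# ... then keep only the low 8 bits, i.e. reduce modulo 256.
wrapped = scaled % 256

print(wrapped[:4])   # [  0 255 254 253]
print(wrapped[-1])   # 1
```

Since 255 is congruent to -1 modulo 256, multiplying by 255 and wrapping is the same as negating modulo 256, which gives the near-negative mapping described above.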

And this is what's happened to you: each element of imC is numerically equal to the corresponding element of im, but because imC is a float array rather than an unsigned-small-integer array it gets the treatment described above, and you get almost a photographic negative of the image you expected.
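Two ways to avoid the problem follow from this, sketched here under the same assumption as before (a random uint8 array standing in for the CIFAR-10 image): either allocate the copy with the source's element type, or keep floats but rescale them into the 0 to 1 range that imshow expects for float input.

```python
import numpy

# Stand-in for one CIFAR-10 image: random values, NOT the real dataset.
im = numpy.random.randint(0, 256, size=(32, 32, 3)).astype(numpy.uint8)

# Option 1: give the copy the same element type as the source,
# keeping the three loops so individual values can be manipulated.
imC = numpy.zeros((32, 32, 3), dtype=im.dtype)
for k in range(3):
    for row in range(32):
        for col in range(32):
            imC[row, col, k] = im[row, col, k]

# Option 2: work in floats, but rescale into the range 0..1.
imF = im.astype(numpy.float64) / 255.0

print(imC.dtype)             # uint8
print(imF.min(), imF.max())  # both within 0..1
```

Either imC or imF can then be passed to plt.imshow and should display like the original image.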

answered Nov 01 '22 23:11 by Gareth McCaughan