Why does NumPy's random function seemingly display a pattern in its generated values?

I was playing around with NumPy and Pillow and came across an interesting result that apparently shows a pattern in the values returned by numpy.random.random().

[Images: Image One, Image Two, Image Three, Image Four]

Here is the full code for generating and saving 100 of these images (with seed 0); the images above are the first four it generates.

import numpy as np
from PIL import Image

np.random.seed(0)
img_arrays = np.random.random((100, 256, 256, 3)) * 255
for i, img_array in enumerate(img_arrays):
    img = Image.fromarray(img_array, "RGB")
    img.save("{}.png".format(i))

The above are four different images created by calling PIL.Image.fromarray() on four different NumPy arrays, each generated with numpy.random.random((256, 256, 3)) * 255 to produce a 256 by 256 grid of RGB values. Each was made in a separate Python instance, though the same thing also happens within a single instance.

I noticed that (in my limited testing) this only happens when the width and height of the image are powers of two. I am not sure how to interpret that.

Although it may be hard to see due to browser anti-aliasing (you can download the images and view them in image viewers with no anti-aliasing), there are clear purple-brown columns of pixels every 8th column starting from the 3rd column of every image. To make sure, I tested this on 100 different images and they all followed this pattern.

What is going on here? I am guessing that patterns like this are the reason that people always say to use cryptographically secure random number generators when true randomness is required, but is there a concrete explanation behind why this is happening in particular?

Ziyad Edher asked May 11 '18 13:05


2 Answers

Don't blame NumPy, blame PIL / Pillow. ;) You're generating floats, but PIL expects integers, and its float-to-int conversion is not doing what we want. Further research is required to determine exactly what PIL is doing...
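One way to get a feel for what might be happening is to look at the raw bytes of the float array. The sketch below uses NumPy only and rests on an assumption that is not confirmed here: that when a mode such as "RGB" is passed explicitly, Pillow reads the array's memory as plain bytes instead of converting the values. The names values, raw and top_byte_share are made up for illustration.

```python
import numpy as np

np.random.seed(0)
# One image's worth of values, forced to little-endian float64 so the byte order is explicit.
values = (np.random.random((256, 256, 3)) * 255).astype("<f8")

# View the same memory as raw bytes: 8 bytes per float, least significant byte first.
raw = values.reshape(-1).view(np.uint8).reshape(-1, 8)

# Fraction of floats whose most significant byte is 0x40.
# Every float64 in [2, 256) has that top byte, so this should be close to 1.
top_byte_share = float(np.mean(raw[:, 7] == 0x40))

print(raw[:3])
print(top_byte_share)
```

If the assumption holds, one byte in every eight is nearly constant rather than random, which would be enough to produce a regular visible pattern.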

In the meantime, you can get rid of those lines by explicitly converting your values to unsigned 8-bit integers:

img_arrays = (np.random.random((100, 256, 256, 3)) * 255).astype(np.uint8)

As FHTMitchell notes in the comments, a more efficient form is

img_arrays = np.random.randint(0, 256, (100, 256, 256, 3), dtype=np.uint8) 
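As a rough comparison of the two forms (a small sketch with made-up names floats, truncated and direct, and a tiny array instead of 100 images): both produce one byte per channel, but the truncating cast can never yield 255, because random() * 255 is always strictly below 255.

```python
import numpy as np

np.random.seed(0)
floats = np.random.random((4, 4, 3)) * 255

# Truncating cast: every value lies in [0, 255), so the result tops out at 254.
truncated = floats.astype(np.uint8)

# Drawing integers directly covers 0..255 (the upper bound 256 is exclusive).
direct = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)

print(truncated.dtype, truncated.itemsize, truncated.max())
print(direct.dtype, direct.itemsize)
```

Either array has the one-byte-per-channel layout that an "RGB" image needs.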

Here's typical output from that modified code:

[Image: random image made using NumPy]


The PIL Image.fromarray function has a known bug, as described here. The behaviour you're seeing is probably related to that bug, but I guess it could be an independent one. ;)

FWIW, here are some tests and workarounds I did on the bug mentioned on the linked question.

PM 2Ring answered Sep 21 '22 16:09


I'm pretty sure the problem has to do with the dtype, but not for the reasons you think. Here is one made with np.random.randint(0, 256, (1, 256, 256, 3), dtype=np.uint32). Note that the dtype is not np.uint8:

[Image: output generated with dtype=np.uint32]

Can you see the pattern? ;) PIL interprets 32-bit (4-byte) values (probably as 4 pixels RGBK) differently from 8-bit values (RGB for one pixel). (See PM 2Ring's answer.)
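To illustrate why the element size matters, here is a small NumPy-only sketch. It does not call PIL; it just inspects the bytes that a byte-oriented reader would see. The names pixels32, as_bytes and zero_share are mine.

```python
import numpy as np

np.random.seed(0)
pixels32 = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint32)

# Force little-endian and view each 4-byte integer as its individual bytes.
as_bytes = pixels32.astype("<u4").reshape(-1).view(np.uint8).reshape(-1, 4)

# Values below 256 fit in the lowest byte, so the other three bytes are always zero.
zero_share = float(np.mean(as_bytes == 0))

print(as_bytes[:3])
print(zero_share)
```

Roughly three quarters of the bytes are zero, so anything that treats this buffer as a stream of 8-bit channels will see mostly black with a regular structure.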

Originally you were passing 64-bit float values, which are also going to be interpreted differently (and probably not in the way you intended).
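Assuming PIL simply consumes the raw bytes of the float64 buffer (an assumption, not something checked against Pillow's source), the column pattern from the question can be reproduced with NumPy alone. Each float64 is 8 bytes and each RGB pixel takes 3, so the layout repeats every 24 bytes, which is every 8 pixels. In this sketch fake_rgb, green_is_0x40 and striped_columns are illustrative names.

```python
import numpy as np

np.random.seed(0)
values = (np.random.random((256, 256, 3)) * 255).astype("<f8")

# Keep only as many raw bytes as a 256x256 RGB image needs (assumed PIL behaviour).
raw = values.reshape(-1).view(np.uint8)[: 256 * 256 * 3]
fake_rgb = raw.reshape(256, 256, 3)

# For each column, the share of rows whose green channel equals 0x40,
# the top byte of nearly every float64 in [2, 256).
green_is_0x40 = np.mean(fake_rgb[:, :, 1] == 0x40, axis=0)
striped_columns = np.flatnonzero(green_is_0x40 > 0.9).tolist()

print(striped_columns[:4])  # → [2, 10, 18, 26]
```

The near-constant byte lands in every 8th column starting from the 3rd (index 2), matching the stripes described in the question. A 256-pixel row is 768 bytes, a multiple of 24, so the stripes line up vertically. A width that is not a multiple of 8 would shift the pattern from row to row, which may be why it only showed up for power-of-two sizes.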

FHTMitchell answered Sep 18 '22 16:09