Why the dip in speed increase for generating 400,000,000 random numbers?

I'm generating around 400,000,000 (400 million) random numbers in parallel on an Intel i7 with 4 cores (8 threads hyperthreaded) on macOS with 8 GB RAM.

However, I'm also generating 400,000,000 random numbers on a DigitalOcean server with 20 cores on Debian with 64 GB RAM.

Here's the code:

import multiprocessing
import random

rangemin = 1
rangemax = 9

def randomGenPar_backend(backinput):
    return random.randint(rangemin, rangemax)

def randomGenPar(num):
    pool = multiprocessing.Pool()
    return pool.map(randomGenPar_backend, range(0, num))

randNum = 400000000

random.seed(999)
randomGenPar(randNum)

These are the results of the benchmark (times in seconds):

5,000,000 Random Numbers:
1 Core: 5.984
8 Core: 1.982

50,000,000 Random Numbers:
1 Core: 57.28
8 Core: 19.799
20 Core: 18.257
Times Benefit (20 core vs. 8 core) = 1.08

100,000,000 Random Numbers:
1 Core: 115
8 Core: 40.434
20 Core: 31.652
Times Benefit (20 core vs. 8 core) = 1.28

200,000,000 Random Numbers:
8 Core: 87
20 Core: 60
Times Benefit (20 core vs. 8 core) = 1.45

300,000,000 Random Numbers:
8 Core: 157
20 Core: 88
Times Benefit (20 core vs. 8 core) = 1.78

400,000,000 Random Numbers:
8 Core: 202
20 Core: 139
Times Benefit (20 core vs. 8 core) = 1.45 (DIP!)

500,000,000 Random Numbers:
8 Core: 280
20 Core: 171
Times Benefit (20 core vs. 8 core) = 1.64 (INCREASE!)

600,000,000 Random Numbers:
8 Core: 342
20 Core: 198
Times Benefit (20 core vs. 8 core) = 1.73

700,000,000 Random Numbers:
8 Core: 410
20 Core: 206
Times Benefit (20 core vs. 8 core) = 1.99

800,000,000 Random Numbers:
8 Core: 482
20 Core: 231
Times Benefit (20 core vs. 8 core) = 2.09

Usually, the more random numbers are generated, the better the 20-core CPU's parallelism can be exploited. Therefore, the speedup factor from 8 cores to 20 cores grows with the problem size.

However, after 300 million random numbers the ratio dips, and then climbs again up to 800 million (I haven't tested further).

Why is this? Is there a specific reason, or was it just random? (I've repeated the test twice and got the same result both times.)

EDIT: If it makes any difference, I'm using the time function to time the execution of the script. Also, the OS is not the same on the two machines (8-core: macOS; 20-core: Debian).
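Assuming "the time function" refers to Python's time module rather than the shell's time command (the question leaves this ambiguous), the measurement would look roughly like this sketch, where the generation call stands in for the script's real workload:

```python
import random
import time

def workload(n):
    # Stand-in for the script's actual work.
    return [random.randint(1, 9) for _ in range(n)]

start = time.perf_counter()  # perf_counter() is preferable to time() for benchmarks
workload(100000)
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.3f}s")
```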

asked Jun 12 '17 by TajyMany


1 Answer

Two possible explanations come to mind.

This could be an artifact of garbage collection kicking in. An easy experiment would be to turn off GC and see whether the "dip" persists:

>>> import gc
>>> gc.disable()
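A fuller version of that experiment might look like the sketch below. The helper function and sizes are illustrative stand-ins for the asker's workload, not their exact script:

```python
import gc
import random
import time

def generate(n):
    """Serial stand-in for the random-number workload."""
    return [random.randint(1, 9) for _ in range(n)]

def timed_run(n):
    start = time.perf_counter()
    data = generate(n)
    return time.perf_counter() - start, len(data)

# Compare timings with GC enabled vs. disabled; if the dip is
# GC-related, the disabled runs should scale more smoothly.
gc.enable()
t_on, _ = timed_run(500000)
gc.disable()
t_off, _ = timed_run(500000)
gc.enable()  # restore the default state
print(f"gc on: {t_on:.3f}s, gc off: {t_off:.3f}s")
```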

Another possibility is that this is an artifact of list growth using realloc() under the hood. Lists are implemented as fixed-length arrays of pointers. When map() grows the result list with append(), the C function realloc() is called periodically to resize the array of pointers. Often, this call is very cheap because none of the data has to be moved. However, if even a single byte in memory "obstructs" the resize, then all of the data has to be relocated. This is very expensive and could cause the "dip" if, at that point in execution, multiprocessing has created an obstructing byte.

To test this hypothesis, you could use imap() instead of map() and feed the results into a collections.deque instead of a list. The deque implementation does not use realloc(), so its performance is consistent even in the face of fragmented memory (internally, it just makes repeated calls to malloc() to obtain fixed-length memory blocks).
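Applied to the asker's script, that suggestion might look like the following sketch (the chunksize value is an illustrative choice, not from the original):

```python
import multiprocessing
import random
from collections import deque

rangemin = 1
rangemax = 9

def randomGenPar_backend(_):
    return random.randint(rangemin, rangemax)

def randomGenPar(num):
    # imap() yields results lazily instead of materializing one big list,
    # and deque stores them in fixed-size blocks obtained via malloc(),
    # so no large realloc() of a pointer array is ever needed.
    with multiprocessing.Pool() as pool:
        return deque(pool.imap(randomGenPar_backend, range(num), chunksize=10000))

if __name__ == "__main__":
    result = randomGenPar(100000)
    print(len(result))
```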

answered Oct 16 '22 by Raymond Hettinger