My expectation was that pypy could be as much as an order of magnitude faster than python, but the results indicate that pypy is in fact slower than expected. I have two questions about these results; the full numbers for Python 2.7.5 and Pypy 2.2.1 are listed in the Timings section further down.
Algorithm:
I am generating a list of points in space using a simple algorithm, and I'm trying to optimize it.
def generate(size=32, point=(0, 0, 0), width=32):
    """
    generate points in space around a center point with a specific width and
    number of divisions (size)
    """
    X, Y, Z = point
    half = width * 0.5
    scale = width / size    # cell size (integer division when both are ints)
    offset = scale * 0.5    # shift from a cell's corner to its center
    # move the starting point to the center of the corner cell
    X = X + offset - half
    Y = Y + offset - half
    Z = Z + offset - half
    for x in xrange(size):
        x = (x * scale) + X
        for y in xrange(size):
            y = (y * scale) + Y
            for z in xrange(size):
                z = (z * scale) + Z
                yield (x, y, z)
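As a quick sanity check of what generate yields (example values of my own, not from the timed runs):

points = list(generate(size=2, point=(0, 0, 0), width=2))
print len(points)   # 8, i.e. size ** 3 points
print points[0]     # (-0.5, -0.5, -0.5): half a cell in from the corner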
In optimizing, I started looking at using pypy rather than python, and in comparing the two I came up with a few different scenarios (a sketch of the timing harness follows them):
counting using xrange
rsize = 8 # size of region
csize = 32 # size of chunk
number_of_points = rsize ** 3 * csize ** 3  # 16,777,216 points
[x for x in xrange(number_of_points)]
counting using xrange with numpy
import numpy as np

rsize = 8 # size of region
csize = 32 # size of chunk
number_of_points = rsize ** 3 * csize ** 3
np.array([x for x in xrange(number_of_points)])
running with the algorithm above
rsize = 8 # size of region
csize = 32 # size of chunk
[p
 for rp in generate(size=rsize, width=rsize*csize)
 for p in generate(size=csize, width=csize, point=rp)]
running the algorithm above with numpy
rsize = 8 # size of region
csize = 32 # size of chunk
np.array([p
          for rp in generate(size=rsize, width=rsize*csize)
          for p in generate(size=csize, width=csize, point=rp)])
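Each scenario was timed in milliseconds of wall-clock time; a minimal sketch of that kind of harness (time_ms is a name of my own, not the exact code behind the numbers below):

import time

def time_ms(fn):
    # run fn once and return the elapsed wall-clock time in milliseconds
    start = time.time()
    fn()
    return (time.time() - start) * 1000.0

print time_ms(lambda: [x for x in xrange(number_of_points)])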
Background:
I am attempting to create a voxel engine, and I want to optimize my algorithm to bring down generation times to a manageable level. While I clearly won't achieve anything close to Java/C++, I'd like to push python (or pypy) as far as I can.
From some earlier timings, I noticed that list lookups are significantly faster than dictionary lookups. Lists are also faster to read than tuples (which I did not expect), though tuples are faster to generate. Numpy has even faster read times than the non-numpy options; however, numpy creation time can be orders of magnitude slower.
So if reading is most important, there is a clear advantage to using a numpy array. If reading and creation are of equal importance, however, then a straight list is likely best. That said, I don't have a clean way of measuring memory usage, but I suspect that lists are far less memory efficient than tuples or numpy arrays. Also, while it's a small difference, I found .get on a dictionary to be slightly faster than the __getitem__ call (i.e. dictionary[lookup] vs. dictionary.get(lookup)); both forms are illustrated below.
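For clarity, the two dictionary lookup forms being compared (a trivial illustration, not the timed code itself):

d = {'key': 'value'}
d['key']        # __getitem__: raises KeyError if the key is missing
d.get('key')    # .get: returns None (or a supplied default) on a miss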
Timings:
Python 2.7.5
Reading
- Option 1: tuple access... 2045.51 ms
- Option 2: tuple access (again)... 2081.97 ms # sampling effect of cache
- Option 3: list access... 2072.09 ms
- Option 4: dict access... 3436.53 ms
- Option 5: iterable access... N/A
- Option 6: numpy array... 1752.44 ms
Creation
- Option 1: tuple creation... 690.36 ms
- Option 2: tuple creation (again)... 716.49 ms # sampling effect of cache
- Option 3: list creation... 684.28 ms
- Option 4: dict creation... 1498.94 ms
- Option 5: iterable creation... 0.01 ms
- Option 6: numpy creation... 3514.25 ms
Pypy 2.2.1
Reading
- Option 1: tuple access... 243.34 ms
- Option 2: tuple access (again)... 246.51 ms # sampling effect of cache
- Option 3: list access... 139.65 ms
- Option 4: dict access... 454.65 ms
- Option 5: iterable access... N/A
- Option 6: numpy array... 21.60 ms
Creation
- Option 1: tuple creation... 1016.27 ms
- Option 2: tuple creation (again)... 1063.50 ms # sampling effect of cache
- Option 3: list creation... 365.98 ms
- Option 4: dict creation... 2258.44 ms
- Option 5: iterable creation... 0.00 ms
- Option 6: numpy creation... 12514.20 ms
In all the examples, random lookups were generated against random data.

import random

dsize = 10 ** 7 # 10 million data points
data = [(i, random.random() * dsize)
        for i in range(dsize)]
lookup = tuple(int(random.random() * dsize) for i in range(dsize))
The loops were very simple:
for x in lookup:
    data_of_specific_type[x]  # access only; the value is discarded
And data_of_specific_type was a conversion of data into that type (e.g. tuple(data), list(data), etc.), as sketched below.
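A sketch of those conversions (the dict and numpy variants are my reading of the setup, so treat the details as assumptions):

data_as_tuple = tuple(data)
data_as_list = list(data)
data_as_dict = dict(data)       # works since each element is an (i, value) pair
data_as_numpy = np.array(data)  # shape (dsize, 2)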
Part of the issue is that:

np.array([p
          for rp in generate(size=rsize, width=rsize*csize)
          for p in generate(size=csize, width=csize, point=rp)])

does all the work of creating the list and then converting it to an np.array.
A faster way to do this would be to preallocate the array and fill it in place:

n = (rsize * csize) ** 3      # total number of points
arr = np.empty((n, 3))        # one row per (x, y, z) point
i = 0
for rp in generate(size=rsize, width=rsize*csize):
    for p in generate(size=csize, width=csize, point=rp):
        arr[i] = p
        i += 1
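Going further, the per-point Python loop can be dropped entirely by building each grid with numpy directly. A minimal sketch, assuming a numpy complete enough to offer meshgrid (generate_np is a hypothetical name; it mirrors the loop order of generate above):

import numpy as np

def generate_np(size=32, point=(0, 0, 0), width=32):
    # vectorized equivalent of generate(): returns a (size ** 3, 3) array
    scale = float(width) / size
    # per-axis coordinates, offset to the center of each cell
    axes = [c + scale * 0.5 - width * 0.5 + np.arange(size) * scale
            for c in point]
    xs, ys, zs = np.meshgrid(*axes, indexing='ij')
    return np.column_stack([xs.ravel(), ys.ravel(), zs.ravel()])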