Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which operator (+ vs +=) should be used for performance? (In-place Vs not-in-place)

Let's say I have this two snippet of code in python :

1 --------------------------
import numpy as np
x = np.array([1,2,3,4])
y = x
x = x + np.array([1,1,1,1])
print y

2 --------------------------
import numpy as np
x = np.array([1,2,3,4])
y = x
x += np.array([1,1,1,1])
print y

I thought the result of y will be the same in both examples since y point out to x and x become (2,3,4,5), BUT it wasn't

The results were (1,2,3,4) for 1 and (2,3,4,5) for 2.

After some research I find out that in first example

#-First example---------------------------------------
x = np.array([1,2,3,4]) # create x --> [1,2,3,4] 
y = x                   # made y point to x
# unril now we have x --> [1,2,3,4]
#                          |
#                          y
x = x + np.array([1,1,1,1]) 
# however this operation **create a new array** [2,3,4,5] 
# and made x point to it instead of the first one
# so we have y --> [1,2,3,4] and x --> [2,3,4,5]

#-Second example--------------------------------------
x = np.array([1,2,3,4]) # create x --> [1,2,3,4] 
y = x                   # made y point to x
# unril now the same x --> [1,2,3,4]
#                            |
#                            y
x += np.array([1,1,1,1])
# this operation **Modify the existing array**
# so the result will be
# unril now the same x --> [2,3,4,5]
#                            |
#                            y

You can find out more about this behaviors (not only for this example) in this link In-place algorithm

My question is : Being aware of this behavior why should I use in-place algorithm in term of performance? (time of excution faster? less memory alocation?..)

EDIT : Clarification

The example of (+, =+) was just to explain simply the in-place algorithm to the one who don't know.. but the question was in general the use of in-place algorithm not only in this case..

As another more complex example: loading a CSV file (just 10 Million rows) in a variable then sorting the result, is the idea of in-place algorithm is to produce an output in the same memory space that contains the input by successively transforming that data until the output is produced? - This avoids the need to use twice the storage - one area for the input and an equal-sized area for the output ( Using the minimum amount of RAM, hard disk ... )

like image 378
AnouarZ Avatar asked Nov 15 '17 12:11

AnouarZ


1 Answers

x = x + 1 vs x += 1

Performance

It seems that you understand the semantical difference between x += 1 and x = x + 1.

For benchmarking, you can use timeit in IPython.

After defining those functions:

import numpy as np
def in_place(n):
    x = np.arange(n)
    x += 1

def not_in_place(n):
    x = np.arange(n)
    x = x + 1

def in_place_no_broadcast(n):
    x = np.arange(n)
    x += np.ones(n, dtype=np.int)

You can simply use the %timeit syntax to compare performances:

%timeit in_place(10**7)
20.3 ms ± 81.4 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit not_in_place(10**7)
30.4 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit in_place_no_broadcast(10**7)
35.4 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

not_in_place is 50% slower than in_place.

Note that broadcasting also makes a huge difference : numpy understands x += 1 as adding a 1 to every single element of x, without having to create yet another array.

Warning

in_place should be the preferred function: it's faster and uses less memory. You might run into bugs if you use and mutate this object at different places in your code, though. The typical example would be :

x = np.arange(5)
y = [x, x]
y[0][0] = 10
y
# [array([10,  1,  2,  3,  4]), array([10,  1,  2,  3,  4])]

Sorting

Your understanding of the advantages of in-place sorting is correct. It can make a huge difference in memory requirements when sorting large data sets.

There are other desirable features for a sorting algorithm (stable, acceptable worst-case complexity, ...) and it looks like the standard Python algorithm (Timsort) has many of them.

Timsort is an hybrid algorithm. Some parts of it are in-place and some require extra memory. It will never use more than n/2 though.

like image 198
Eric Duminil Avatar answered Oct 06 '22 22:10

Eric Duminil