When measuring the memory consumption of np.zeros:
import psutil
import numpy as np
process = psutil.Process()
N=10**8
start_rss = process.memory_info().rss
a = np.zeros(N, dtype=np.float64)
print("memory for a", process.memory_info().rss - start_rss)
the result is an unexpected 8192 bytes, i.e. almost 0, while 1e8 doubles would need 8e8 bytes.
When replacing np.zeros(N, dtype=np.float64) with np.full(N, 0.0, dtype=np.float64), the memory needed for a is 800002048 bytes.
There are similar discrepancies in running times:
import numpy as np
N=10**8
%timeit np.zeros(N, dtype=np.float64)
# 11.8 ms ± 389 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.full(N, 0.0, dtype=np.float64)
# 419 ms ± 7.69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
I.e. np.zeros is up to 40 times faster for big sizes (about 35 times in the run above).
I'm not sure whether these differences hold on all architectures and operating systems, but I've observed them at least on x86-64 Windows and Linux.
Which differences between np.zeros and np.full can explain the different memory consumption and the different running times?
I don't trust psutil for these memory benchmarks, and rss (Resident Set Size) may not be the right metric in the first place.
Using the stdlib tracemalloc you can get correct-looking numbers for memory allocation. The delta should be approximately 800000000 bytes for this N and float64 dtype:
>>> import numpy as np
>>> import tracemalloc
>>> N = 10**8
>>> tracemalloc.start()
>>> tracemalloc.get_traced_memory() # current, peak
(159008, 1874350)
>>> a = np.zeros(N, dtype=np.float64)
>>> tracemalloc.get_traced_memory()
(800336637, 802014880)
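The same measurement can be wrapped in a small script that compares np.zeros and np.full side by side. This is only a sketch: the helper name traced_delta is made up for illustration, and N is reduced to 10**6 so that it runs quickly. Under these assumptions both calls should report roughly N * 8 bytes of traced allocation.

```python
import tracemalloc

import numpy as np


def traced_delta(make_array):
    """Return (array, growth of traced memory in bytes) for one allocation.

    traced_delta is a hypothetical helper, not part of NumPy or tracemalloc.
    """
    before, _peak = tracemalloc.get_traced_memory()
    arr = make_array()
    after, _peak = tracemalloc.get_traced_memory()
    return arr, after - before


N = 10**6  # assumption: smaller than the 10**8 used above, to keep this quick

tracemalloc.start()
zeros_arr, zeros_bytes = traced_delta(lambda: np.zeros(N, dtype=np.float64))
full_arr, full_bytes = traced_delta(lambda: np.full(N, 0.0, dtype=np.float64))
tracemalloc.stop()

print("np.zeros traced bytes:", zeros_bytes)
print("np.full  traced bytes:", full_bytes)
```

tracemalloc counts the bytes requested from the allocator, not the pages that are currently resident, so it does not depend on whether the operating system has physically backed the memory yet.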
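For the timing differences between np.full and np.zeros, compare the man pages for malloc and calloc, i.e. np.zeros is able to go to an allocation routine which gets zeroed pages. For large requests, the operating system can typically hand out such pages without physically backing them until they are first written, which is also consistent with the tiny rss delta in the question. See PyArray_Zeros, which calls PyArray_NewFromDescr_int, passing in 1 for the zeroed argument; that function then has a special case for allocating zeros faster: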
if (zeroed || PyDataType_FLAGCHK(descr, NPY_NEEDS_INIT)) {
    data = npy_alloc_cache_zero(nbytes);
}
else {
    data = npy_alloc_cache(nbytes);
}
It looks like np.full does not have this fast path. There, the performance will be similar to first allocating an uninitialized array and then filling it with an O(n) copy:
a = np.empty(N, dtype=np.float64)
a[:] = np.float64(0.0)
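A rough way to check this claim is to time the three variants against each other. The script below is a minimal sketch using the stdlib timeit module. The function names and the reduced N are assumptions made for illustration, and the size of the gap will depend on array size, allocator and operating system, so no particular numbers are claimed.

```python
import timeit

import numpy as np

N = 10**6  # assumption: raise towards 10**8 to get closer to the question's setup


def with_zeros():
    return np.zeros(N, dtype=np.float64)


def with_full():
    return np.full(N, 0.0, dtype=np.float64)


def with_empty_then_fill():
    # Allocate uninitialized memory, then write every element once.
    a = np.empty(N, dtype=np.float64)
    a[:] = np.float64(0.0)
    return a


for fn in (with_zeros, with_full, with_empty_then_fill):
    # Best of 5 repeats, each averaging 20 calls.
    best = min(timeit.repeat(fn, number=20, repeat=5)) / 20
    print(f"{fn.__name__:22s} {best * 1e3:8.3f} ms per call")
```

If np.full really behaves like the empty-then-fill variant, those two timings should land close to each other, with np.zeros ahead mainly for large arrays.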
The numpy devs could presumably have added a fast path to np.full for a zero fill value, but why bother adding another way to do the same thing? Users can just use np.zeros in the first place.
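For illustration only, such a fast path could be sketched in user code as a thin wrapper. The names full_with_zero_fast_path and _is_positive_zero are hypothetical and are not part of NumPy. The sketch handles scalar fill values only, and it deliberately skips the fast path for -0.0 and for non-numeric dtypes, where an all-zero-bytes buffer would not equal the requested fill value.

```python
import math

import numpy as np


def _is_positive_zero(value):
    """True for a scalar 0 or +0.0 (hypothetical helper; -0.0 is excluded)."""
    if isinstance(value, (int, np.integer)):
        return value == 0
    if isinstance(value, (float, np.floating)):
        return value == 0 and math.copysign(1.0, value) > 0
    return False


def full_with_zero_fast_path(shape, fill_value, dtype=None):
    """Like np.full for scalar fill values, but defers to np.zeros when possible."""
    if dtype is None:
        # Infer the dtype from the fill value, as np.full does.
        dtype = np.asarray(fill_value).dtype
    dtype = np.dtype(dtype)
    # Only bool/int/uint/float/complex dtypes represent zero as all-zero bytes
    # in a way that matches what np.full would have written.
    if dtype.kind in "biufc" and _is_positive_zero(fill_value):
        return np.zeros(shape, dtype=dtype)
    return np.full(shape, fill_value, dtype=dtype)


print(full_with_zero_fast_path(3, 0.0))
print(full_with_zero_fast_path(3, 7))
```

The extra checks hint at why a built-in fast path is less trivial than it first looks: the result must stay identical to the general fill path for every dtype and fill value.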
The numpy.zeros function is implemented directly in the C layer of the NumPy library, while ones and full are Python functions that work the same way as each other: they allocate an array and then copy the desired value into it.
So zeros can return zeroed memory straight from C, while ones and full always pay for an extra pass that writes every element, on top of a small amount of Python-level overhead.
Have a look at the source code to see for yourself: https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py
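As a rough picture of that allocate-then-copy pattern, here is a simplified sketch. The name full_sketch is made up for illustration, and the real function in numeric.py accepts more arguments and handles more cases than this.

```python
import numpy as np


def full_sketch(shape, fill_value, dtype=None):
    """Simplified, hypothetical stand-in for a Python-level np.full."""
    if dtype is None:
        # Take the dtype from the fill value when none is given.
        dtype = np.asarray(fill_value).dtype
    out = np.empty(shape, dtype=dtype)            # uninitialized allocation
    np.copyto(out, fill_value, casting="unsafe")  # O(n) pass over every element
    return out


print(full_sketch(4, 2.5))
```

The np.copyto call is where the time goes for large arrays, since it has to touch all of the freshly allocated memory.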