Reasons for differences in memory consumption and performance of np.zeros and np.full

When measuring memory consumption of np.zeros:

import psutil
import numpy as np

process = psutil.Process()
N=10**8
start_rss = process.memory_info().rss
a = np.zeros(N, dtype=np.float64)
print("memory for a", process.memory_info().rss - start_rss)

the result is an unexpected 8192 bytes, i.e. almost 0, while 1e8 doubles would need 8e8 bytes.

When replacing np.zeros(N, dtype=np.float64) with np.full(N, 0.0, dtype=np.float64), the memory needed for a is 800002048 bytes.

There are similar discrepancies in running times:

import numpy as np
N=10**8
%timeit np.zeros(N, dtype=np.float64)
# 11.8 ms ± 389 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.full(N, 0.0, dtype=np.float64)
# 419 ms ± 7.69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I.e. np.zeros is up to 40 times faster for big sizes (about 35 times in the run above).

I'm not sure whether these differences hold on all architectures/operating systems, but I've observed them at least on x86-64 Windows and Linux.

Which differences between np.zeros and np.full can explain the different memory consumption and running times?

ead asked Mar 11 '20 16:03


2 Answers

I don't trust psutil for these memory benchmarks, and rss (Resident Set Size) may not be the right metric in the first place.

Using the standard-library tracemalloc module, you get correct-looking numbers for memory allocation. For this N and the float64 dtype, the delta should be approximately 800000000 bytes:

>>> import numpy as np
>>> import tracemalloc
>>> N = 10**8
>>> tracemalloc.start()
>>> tracemalloc.get_traced_memory()  # current, peak
(159008, 1874350)
>>> a = np.zeros(N, dtype=np.float64)
>>> tracemalloc.get_traced_memory()
(800336637, 802014880)
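
The same measurement can be repeated for both constructors side by side. The following is a minimal sketch, assuming the installed NumPy reports its buffer allocations to tracemalloc (as the session above suggests); the helper traced_delta and the smaller N are illustrative choices, not part of the original measurement.

```python
import tracemalloc
import numpy as np

N = 10**6  # 8 MB of float64; the question uses 10**8


def traced_delta(make_array):
    """Return (array, bytes newly traced by tracemalloc while creating it)."""
    before, _ = tracemalloc.get_traced_memory()
    arr = make_array()
    after, _ = tracemalloc.get_traced_memory()
    return arr, after - before


tracemalloc.start()
zeros_arr, zeros_delta = traced_delta(lambda: np.zeros(N, dtype=np.float64))
full_arr, full_delta = traced_delta(lambda: np.full(N, 0.0, dtype=np.float64))
tracemalloc.stop()

# Both deltas should be close to 8 * N bytes, since tracemalloc counts
# requested bytes rather than pages that are resident in RAM.
print("np.zeros traced bytes:", zeros_delta)
print("np.full  traced bytes:", full_delta)
```

If both deltas come out near 8 * N, then from the allocator's point of view the two calls request the same amount of memory, and any difference has to come from how that memory is initialized.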

For the timing differences between np.full and np.zeros, compare the man pages for malloc and calloc: np.zeros is able to use an allocation routine that returns already-zeroed pages. See PyArray_Zeros, which calls PyArray_NewFromDescr_int with 1 for the zeroed argument; that function has a special case for allocating zeros faster:

if (zeroed || PyDataType_FLAGCHK(descr, NPY_NEEDS_INIT)) {
    data = npy_alloc_cache_zero(nbytes);
}
else {
    data = npy_alloc_cache(nbytes);
}

It looks like np.full does not have this fast path. Its performance will be similar to first allocating an uninitialized array and then doing an O(n) copy of the fill value:

a = np.empty(N, dtype=np.float64)
a[:] = np.float64(0.0)
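
To check this claim on a given machine, the three variants can be timed against each other. This is only a rough sketch: the helper empty_then_fill, the candidates dictionary, and the reduced N are assumptions made for illustration, and the absolute numbers will depend on the platform and allocator.

```python
import timeit
import numpy as np

N = 10**7  # smaller than the question's 10**8 to keep the run short


def empty_then_fill(n):
    # Allocate uninitialized memory, then write the fill value into every element.
    a = np.empty(n, dtype=np.float64)
    a[:] = 0.0
    return a


candidates = {
    "np.zeros": lambda: np.zeros(N, dtype=np.float64),
    "np.full": lambda: np.full(N, 0.0, dtype=np.float64),
    "empty + fill": lambda: empty_then_fill(N),
}

timings = {}
for name, make in candidates.items():
    # Best of 5 single runs, in seconds.
    timings[name] = min(timeit.repeat(make, number=1, repeat=5))
    print(f"{name:>12}: {timings[name] * 1e3:.2f} ms")
```

If the explanation holds, np.full and the empty-then-fill variant should land close to each other, with np.zeros noticeably ahead.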

The NumPy devs could presumably have added a fast path to np.full for a fill value of zero, but why bother adding another way to do the same thing? Users can just use np.zeros in the first place.
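
The zeroed pages mentioned above also offer a possible reading of the near-zero RSS delta in the question: on systems that hand out such pages lazily, they may only count as resident once they are written to. The sketch below probes that assumption. The helper rss_bytes is hypothetical; it uses psutil as in the question and otherwise falls back to /proc/self/statm, which exists only on Linux.

```python
import os
import numpy as np


def rss_bytes():
    """Resident set size of this process in bytes."""
    try:
        import psutil  # the library used in the question
        return psutil.Process().memory_info().rss
    except ImportError:
        # Assumed fallback, Linux only: the second field of /proc/self/statm
        # is the number of resident pages.
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE")


N = 10**7  # 80 MB of float64

start = rss_bytes()
a = np.zeros(N, dtype=np.float64)
after_alloc = rss_bytes()
a[:] = 1.0  # write to every element, so every page of the buffer is touched
after_write = rss_bytes()

print("RSS delta after np.zeros:", after_alloc - start)
print("RSS delta after writing: ", after_write - start)
```

If the second delta is close to 8 * N while the first stays near zero, the memory was reserved by np.zeros but not yet backed by physical pages, which would be consistent with both the small RSS figure and the fast allocation.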

wim answered Sep 20 '22 13:09


The numpy.zeros function is implemented directly in the C layer of the NumPy library, while the functions ones and full work alike: each creates an array and then copies the desired value into it.

So zeros needs no Python-level work beyond the call itself, while for ones and full the Python wrapper code has to be interpreted before the C routines run.

Have a look at the source code to see for yourself: https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py
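
The allocate-then-copy pattern described above can be written out as follows. This is a simplified, hypothetical stand-in (full_sketch is not a NumPy function and the real implementation handles more parameters); it only illustrates that the fill is a separate O(n) pass over a freshly allocated buffer.

```python
import numpy as np


def full_sketch(shape, fill_value, dtype=None):
    """Hypothetical, simplified stand-in for np.full (not the NumPy source)."""
    if dtype is None:
        # Infer the dtype from the fill value when none is given.
        dtype = np.asarray(fill_value).dtype
    a = np.empty(shape, dtype=dtype)            # uninitialized buffer
    np.copyto(a, fill_value, casting="unsafe")  # O(n) write of the fill value
    return a


x = full_sketch(5, 0.0, dtype=np.float64)
print(x)
```

Because the copy touches every element, this pattern cannot benefit from memory that already arrives zeroed, regardless of the fill value.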

Laurent GRENIER answered Sep 18 '22 13:09