
What is the difference between numpy.shares_memory and numpy.may_share_memory?

Tags:

python

numpy

Why does numpy.may_share_memory exist?
What is the challenge in giving an exact result?

Is numpy.may_share_memory a deprecated method?
numpy.may_share_memory may give false positives, but it does not give false negatives.

Does numpy.shares_memory give neither false positives nor false negatives?

I use numpy version 1.11.2.

See:

  1. numpy.may_share_memory
  2. numpy.shares_memory
  3. version 1.11.2 source on github
glegoux asked Jul 01 '17 20:07

2 Answers

Quoting the release notes for 1.11.0:

A new function np.shares_memory that can check exactly whether two arrays have memory overlap is added. np.may_share_memory also now has an option to spend more effort to reduce false positives.

Semantically, this suggests that the older may_share_memory test was designed to give a loose guess as to whether memory is shared between the arrays. If surely not, one could proceed accordingly. If the test was positive (possibly a false positive), care had to be taken. The new shares_memory function, on the other hand, allows exact checks. This takes more computational time, but can pay off in the long run, since being free of false positives opens up more possible optimizations. The looser check of may_share_memory probably only guarantees not to return false negatives.
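To make these semantics concrete, a quick illustration (the arrays a, b, c are ours, not from the question):

```python
import numpy as np

a = np.arange(10)
b = a[2:5]        # a genuine view into a
c = a.copy()      # an independent buffer

print(np.shares_memory(a, b))     # True: b is a view of a
print(np.shares_memory(a, c))     # False: c owns its own data
print(np.may_share_memory(a, c))  # False: disjoint bounds, never a false negative
```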

In terms of the documentation of may_share_memory and shares_memory, both have a keyword argument that tells numpy how strict a check the user wants.

may_share_memory:

max_work : int, optional

    Effort to spend on solving the overlap problem. See shares_memory for details. Default for may_share_memory is to do a bounds check.

shares_memory:

max_work : int, optional

    Effort to spend on solving the overlap problem (maximum number of candidate solutions to consider). The following special values are recognized:

    max_work=MAY_SHARE_EXACT (default)

        The problem is solved exactly. In this case, the function returns True only if there is an element shared between the arrays.
    max_work=MAY_SHARE_BOUNDS

        Only the memory bounds of a and b are checked.

Judging by the docs, this suggests that the two functions might call the same underlying machinery, but may_share_memory uses a less strict default setting for the check.

Let's take a peek at the implementation:

static PyObject *
array_shares_memory(PyObject *NPY_UNUSED(ignored), PyObject *args, PyObject *kwds)
{
    return array_shares_memory_impl(args, kwds, NPY_MAY_SHARE_EXACT, 1);
}


static PyObject *
array_may_share_memory(PyObject *NPY_UNUSED(ignored), PyObject *args, PyObject *kwds)
{
    return array_shares_memory_impl(args, kwds, NPY_MAY_SHARE_BOUNDS, 0);
}

calling the same underlying function with signature

static PyObject *
array_shares_memory_impl(PyObject *args, PyObject *kwds, Py_ssize_t default_max_work,
                         int raise_exceptions)
{}

Without delving deeper into the source, it seems to me that shares_memory is an improvement over may_share_memory, since it can reproduce the latter's loose check with the appropriate keyword arguments. The older function is kept for convenience and backward compatibility.

Disclaimer: this is the first time I looked at this part of the source, and I didn't investigate further into array_shares_memory_impl, so my impression can be simply wrong.


As for a specific example of the difference between the two methods (called with default arguments): as explained at the links above, may_share_memory only checks the arrays' memory bounds. If those bounds are disjoint for the two arrays, there is no chance that they share memory. But if they are not disjoint, the arrays can still be independent!

Simple example: a disjoint partitioning of a contiguous block of memory via slicing:

>>> import numpy as np
>>> v = np.arange(6)
>>> x = v[::2]
>>> y = v[1::2]
>>> np.may_share_memory(x,y)
True
>>> np.shares_memory(x,y)
False
>>> np.may_share_memory(x,y,max_work=np.MAY_SHARE_EXACT)
False

As you can see, x and y are two disjoint slices of the same array. Their data ranges thus largely overlap (they are almost the same, save for a single integer in memory). However, none of their elements are actually the same: one contains the even, the other the odd elements of the original contiguous block. So may_share_memory correctly asserts that the arrays may share memory, but a stricter check reveals that they don't.
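We can make the overlapping bounds visible by computing them by hand from each array's base address and strides (the bounds helper below is our own sketch, valid for non-negative strides only, not a NumPy API):

```python
import numpy as np

v = np.arange(6)
x, y = v[::2], v[1::2]

def bounds(a):
    # lowest and one-past-highest byte addresses touched by the array
    # (assumes non-negative strides)
    start = a.__array_interface__['data'][0]
    last = sum((dim - 1) * st for dim, st in zip(a.shape, a.strides))
    return start, start + last + a.itemsize

bx, by = bounds(x), bounds(y)
overlap = max(bx[0], by[0]) < min(bx[1], by[1])
print(overlap)                 # True: the byte ranges overlap...
print(np.shares_memory(x, y))  # False: ...yet no element is shared
```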


As for the additional difficulty of computing the overlap exactly, the work can be traced down to the worker called solve_may_share_memory, which also contains a lot of helpful comments about what's going on. In a nutshell, there's

  1. a quick check and return if the bounds don't overlap, otherwise
  2. a return with MEM_OVERLAP_TOO_HARD if we asked for loose checking (i.e. may_share_memory with default args), which is handled on the calling side as "we don't know, so return True"
  3. otherwise we actually solve the Diophantine equations that the problem maps to starting here

So the work in point 3 above is what needs to additionally be done by shares_memory (or generally, a strict checking case).
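The three steps above can be sketched in pure Python (a toy illustration only: real NumPy solves a Diophantine problem instead of brute-forcing addresses, and this helper assumes non-negative strides):

```python
import numpy as np

def may_share(a, b, exact=False):
    """Toy sketch of the two-tier overlap check (not NumPy's algorithm)."""
    def bnds(arr):
        lo = arr.__array_interface__['data'][0]
        hi = lo + sum((d - 1) * s for d, s in zip(arr.shape, arr.strides)) + arr.itemsize
        return lo, hi
    def addresses(arr):
        base = arr.__array_interface__['data'][0]
        idx = np.indices(arr.shape).reshape(len(arr.shape), -1).T
        return {base + int(np.dot(i, arr.strides)) for i in idx}
    (alo, ahi), (blo, bhi) = bnds(a), bnds(b)
    # step 1: cheap bounds check -- disjoint bounds mean no overlap
    if ahi <= blo or bhi <= alo:
        return False
    # step 2: loose mode gives up here and conservatively answers True
    if not exact:
        return True
    # step 3: exact mode -- brute force here, Diophantine solving in NumPy
    return bool(addresses(a) & addresses(b))

v = np.arange(6)
x, y = v[::2], v[1::2]
print(may_share(x, y))              # True (loose, like may_share_memory)
print(may_share(x, y, exact=True))  # False (exact, like shares_memory)
```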


Before reading the following, read:

http://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html
http://www.scipy-lectures.org/advanced/advanced_numpy/

In fact, the problem is to find the memory overlap of two strided arrays a and b:

See the NumPy implementation (it is important to read the comment in the header).

This problem is equivalent to:

"Find a solution to a bounded Diophantine equation with positive coefficients."

Take an example for 1D-array:

import numpy as np

x = np.arange(8, dtype=np.int8)
a = x[::3]
b = x[1::2]

In memory we have:

(figure: memory layout of the 1D array)

A 1D array is a contiguous structure in memory. Assume that our memory has 64-bit addresses (8 bytes), and that each element of our array has a size of one byte (0 <= np.int8 < 256).

To resolve the overlap issue, the possible memory addresses of an element of a are:

base_a + stride_a * x_a where x_a is a variable (0-based array index).

And likewise for b:

base_b + stride_b * x_b where x_b is a variable (0-based array index).

There is overlap if and only if:

base_a + stride_a * x_a = base_b + stride_b * x_b

We have:

stride_a * x_a - stride_b * x_b = base_b - base_a

with 0 <= x_a < shape_a and 0 <= x_b < shape_b.

We can eliminate the negative coefficient: instead of reading b top to bottom, we read it bottom to top via the change of variable:

x_b' = shape_b - 1 - x_b

We obtain:

stride_a * x_a + stride_b * x_b' = base_b + stride_b * (shape_b - 1) - base_a

Here:

3 x_a + 2 x_b' = 7 (= 1 + 2 * (4 - 1))

with 0 <= x_a < 3 and 0 <= x_b' < 4.

One solution is x_a = 1 and x_b' = 2, i.e. x_b = 1 (reading b from the bottom), so a[1] and b[1] are the same byte, x[3].
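We can verify this solution by computing the byte addresses directly (the addr helper below is our own, for 1D arrays):

```python
import numpy as np

x = np.arange(8, dtype=np.int8)
a = x[::3]   # bytes at offsets 0, 3, 6
b = x[1::2]  # bytes at offsets 1, 3, 5, 7

def addr(arr, i):
    # byte address of element i of a 1D array
    return arr.__array_interface__['data'][0] + i * arr.strides[0]

print(addr(a, 1) == addr(b, 1))  # True: a[1] and b[1] both alias x[3]
print(np.shares_memory(a, b))    # True
```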

....

We can easily generalize this to 2D and then N-D arrays, and to the case where each array element takes more than one byte (for example 4 bytes; all elements of a given array necessarily have a common size in memory).
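Generalized to any number of index variables, a naive brute-force search over the bounded ranges (a sketch only; NumPy uses a far smarter solver) looks like:

```python
from itertools import product

def solve_bounded(coeffs, shapes, rhs):
    """Brute-force search for sum(c_i * x_i) == rhs with 0 <= x_i < shapes[i].

    Illustration only: exponential in the number of dimensions.
    """
    for xs in product(*(range(s) for s in shapes)):
        if sum(c * x for c, x in zip(coeffs, xs)) == rhs:
            return xs
    return None

# the 1D example above: 3*x_a + 2*x_b' = 7, 0 <= x_a < 3, 0 <= x_b' < 4
print(solve_bounded([3, 2], [3, 4], 7))  # (1, 2)
```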

Here is a naive solution on my GitHub, and a performance comparison with the NumPy implementation.

...

glegoux answered Oct 24 '22 13:10