
How to make numpy overloading of __add__ independent of operand order?

I am facing an issue when overloading operators in a class that contains a numpy array as an attribute. Depending on the order of the operands, the result type is either my class A (the desired behavior) or a numpy array. How can I make it always return an instance of A?

Example:

import numpy as np

class A(object):
    """ class overloading a numpy array for addition
    """
    def __init__(self, values):
        self.values = values

    def __add__(self, x):
        """ addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return A(self.values + x)

    def __radd__(self, x):
        """ reversed-order (LHS <-> RHS) addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return A(x + self.values)

    def __array__(self):
        """ so that numpy's array() returns values
        """
        return self.values

    def __repr__(self):
        return "A object: "+repr(self.values)

An instance of A:

>>> a = A(np.arange(5))

This works as expected:

>>> a + np.ones(5)  
A object: array([ 1.,  2.,  3.,  4.,  5.])

This does not:

>>> np.ones(5) + a
array([ 1.,  2.,  3.,  4.,  5.])

Even though this is fine:

>>> list(np.ones(5)) + a
A object: array([ 1.,  2.,  3.,  4.,  5.])

What happens in the second example is that A.__radd__ is not called at all; instead, the numpy array's own __add__ method (from np.ones(5)) is called.

I tried a few suggestions from this post, but __array_priority__ does not seem to make any difference (EDIT after seberg's comment: at least in numpy 1.7.1; it could work on newer versions), and __set_numeric_ops__ leads to a segmentation fault. I guess I am doing something wrong.

Is there any suggestion that works on the simple example above (while keeping the __array__ method)?

EDIT: I do not want A to be a subclass of np.ndarray, since this would come with other complications that I want to avoid, for now at least. Note that pandas seems to have got around this problem:

import pandas as pd
df = pd.DataFrame(np.arange(5)) 
type(df.values + df) is pd.DataFrame  # returns True
isinstance(df, np.ndarray) # returns False

I'd be curious to know how this was done.

SOLUTION: in addition to M4rtini's solution of subclassing, it is possible to add an __array_wrap__ method to the class A (to avoid subclassing). More here. According to seberg, __array_priority__ could also work on newer numpy versions (see comment).
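
As a hedged illustration of the __array_priority__ remark: on recent NumPy releases, an ndarray's binary operators are expected to return NotImplemented when the other operand declares a higher __array_priority__, which lets Python fall back to A.__radd__. A minimal sketch under that assumption (the value 1000 is arbitrary, and the dtype/copy parameters of __array__ are added here for compatibility with newer NumPy; neither comes from the original post):

```python
import numpy as np


class A(object):
    """Wrapper around a numpy array; deliberately not an ndarray subclass."""

    # Assumption: any value above ndarray's default priority (0.0) is enough
    # for ndarray.__add__ to defer to A.__radd__ on recent NumPy versions.
    __array_priority__ = 1000

    def __init__(self, values):
        self.values = np.asarray(values)

    def __add__(self, x):
        return A(self.values + np.asarray(x))

    def __radd__(self, x):
        # Reached for `ndarray + A` once the ndarray has deferred.
        return A(np.asarray(x) + self.values)

    def __array__(self, dtype=None, copy=None):
        # Lets np.asarray(a) / np.array(a) return the wrapped values.
        arr = np.asarray(self.values, dtype=dtype)
        return arr.copy() if copy else arr

    def __repr__(self):
        return "A object: " + repr(self.values)


a = A(np.arange(5))
left = a + np.ones(5)
right = np.ones(5) + a
print(type(left).__name__, type(right).__name__)
```

Whether this works depends on the installed NumPy version, as noted above for 1.7.1.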

Mahé asked Mar 25 '14 11:03


2 Answers

Make A a subclass of np.ndarray and Python will invoke your A.__radd__ method first.

From the object.__radd__ documentation:

Note: If the right operand’s type is a subclass of the left operand’s type and that subclass provides the reflected method for the operation, this method will be called before the left operand’s non-reflected method. This behavior allows subclasses to override their ancestors’ operations.
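
The rule quoted above can be checked without NumPy. A minimal sketch, using hypothetical classes Base, Derived and Unrelated that are not part of the question:

```python
class Base:
    """Stand-in for the left operand's type (np.ndarray in the question)."""

    def __add__(self, other):
        return "Base.__add__"


class Derived(Base):
    """Subclass of Base that provides the reflected method."""

    def __radd__(self, other):
        return "Derived.__radd__"


class Unrelated:
    """Not a subclass of Base, so Base.__add__ gets the first try."""

    def __radd__(self, other):
        return "Unrelated.__radd__"


subclass_result = Base() + Derived()
unrelated_result = Base() + Unrelated()
print(subclass_result)   # Derived.__radd__
print(unrelated_result)  # Base.__add__
```

The second case mirrors the question: a plain A(object) is unrelated to np.ndarray, so the array's own __add__ handles the operation first.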

By subclassing, your A object is indeed able to intercept the addition:

>>> import numpy as np
>>> class A(np.ndarray):
...     """ class overloading a numpy array for addition
...     """
...     def __init__(self, values):
...         self.values = values
...     def __add__(self, x):
...         """ addition
...         """
...         x = np.array(x) # make sure input is numpy compatible
...         return A(self.values + x)
...     def __radd__(self, x):
...         """ reversed-order (LHS <-> RHS) addition
...         """
...         x = np.array(x) # make sure input is numpy compatible
...         return A(x + self.values)
...     def __array__(self):
...         """ so that numpy's array() returns values
...         """
...         return self.values
...     def __repr__(self):
...         return "A object: "+repr(self.values)
... 
>>> a = A(np.arange(5))
>>> a + np.ones(5)  
A object: array([ 1.,  2.,  3.,  4.,  5.])
>>> np.ones(5) + a
A object: array([ 1.,  2.,  3.,  4.,  5.])

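Do study the Subclassing ndarray documentation for caveats and implications. In particular, ndarray subclasses are normally set up through __new__ and __array_finalize__ rather than __init__, so the class above may need adapting on current NumPy versions.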

Martijn Pieters answered Oct 03 '22 14:10


Thanks to @M4rtini and @seberg, it seems that adding __array_wrap__ solves the problem:

class A(object):
    ...
    def __array_wrap__(self, result):
        return A(result)  # other attributes of self can be passed to the constructor here

It appears to be called at the end of any ufunc operation (which includes array addition). This is also how pandas does it (in 0.12.0, pandas/core/frame.py, line 6020).
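
Put together, the approach might look like the sketch below. It assumes that the ufunc machinery looks up __array_wrap__ on non-ndarray inputs, as described above. The context and return_scalar parameters, and the dtype/copy parameters of __array__, are not in the snippet above; they are added on the assumption that newer NumPy versions may pass them.

```python
import numpy as np


class A(object):
    """Wrapper around a numpy array; not an ndarray subclass."""

    def __init__(self, values):
        self.values = np.asarray(values)

    def __add__(self, x):
        return A(self.values + np.asarray(x))

    def __radd__(self, x):
        return A(np.asarray(x) + self.values)

    def __array__(self, dtype=None, copy=None):
        # Used by numpy to convert A into a plain ndarray input.
        arr = np.asarray(self.values, dtype=dtype)
        return arr.copy() if copy else arr

    def __array_wrap__(self, result, context=None, return_scalar=False):
        # Receives the raw ndarray produced by a ufunc and re-wraps it.
        # The two optional parameters are an assumption about newer NumPy.
        return A(result)

    def __repr__(self):
        return "A object: " + repr(self.values)


a = A(np.arange(5))
wrapped = np.ones(5) + a  # ndarray.__add__ runs np.add, then __array_wrap__
print(type(wrapped).__name__)
```

Here A.__radd__ is still not called for np.ones(5) + a; the addition is carried out by numpy and only the result is converted back to A.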

Mahé answered Oct 03 '22 13:10