I am working with a Python object that implements __add__
, but does not subclass int
. MyObj1 + MyObj2
works fine, but sum([MyObj1, MyObj2])
led to a TypeError
, because sum()
first attempts 0 + MyObj
. In order to use sum()
, my object needs __radd__
to handle 0 + MyObj
or I need to provide an empty object as the start
parameter. The object in question is not designed to be empty.
Before anyone asks, the object is not list-like or string-like, so use of join() or itertools would not help.
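A minimal sketch of the failure described above. The Interval class here is a hypothetical stand-in (not the real module's class), assumed only to support + with its own type:

```python
class Interval:
    """Hypothetical stand-in: supports + with its own type, has no empty value."""

    def __init__(self, *spans):
        self.spans = list(spans)

    def __add__(self, other):
        if not isinstance(other, Interval):
            return NotImplemented
        return Interval(*(self.spans + other.spans))


a = Interval((3, 6))
b = Interval((10, 13))

combined = a + b  # fine: Interval.__add__ handles it

try:
    sum([a, b])  # sum() starts from 0, so it first evaluates 0 + a
except TypeError as exc:
    error = str(exc)
```

The TypeError comes from the very first step, 0 + a, before any two objects are added to each other.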
Edit for details: the module has a SimpleLocation and a CompoundLocation. I'll abbreviate Location to Loc. A SimpleLoc
contains one right-open interval, i.e. [start, end). Adding SimpleLoc
yields a CompoundLoc
, which contains a list of the intervals, e.g. [[3, 6), [10, 13)]
. End uses include iterating through the union, e.g. [3, 4, 5, 10, 11, 12]
, checking length, and checking membership.
The numbers can be relatively large (say, smaller than 2^32 but commonly 2^20). The intervals probably won't be extremely long (100-2000, but could be longer). Currently, only the endpoints are stored. I am now tentatively thinking of attempting to subclass set
such that the location is constructed as set(xrange(start, end))
. However, adding sets will give Python (and mathematicians) fits.
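For comparison, here is a rough sketch of how the end uses listed above (iteration, length, membership) might be served while still storing only endpoints. The class layout is an assumption for illustration (in particular, making SimpleLoc a subclass of CompoundLoc), and it assumes the intervals do not overlap:

```python
from itertools import chain


class CompoundLoc:
    """Hypothetical sketch: a list of right-open (start, end) pairs."""

    def __init__(self, intervals=()):
        self.intervals = [tuple(iv) for iv in intervals]

    def __add__(self, other):
        if not isinstance(other, CompoundLoc):
            return NotImplemented
        return CompoundLoc(self.intervals + other.intervals)

    def __iter__(self):
        # Walk each interval lazily; nothing is materialized up front.
        return chain.from_iterable(range(s, e) for s, e in self.intervals)

    def __len__(self):
        # Assumes the stored intervals do not overlap.
        return sum(e - s for s, e in self.intervals)

    def __contains__(self, n):
        return any(s <= n < e for s, e in self.intervals)


class SimpleLoc(CompoundLoc):
    """A single interval [start, end); adding two yields a CompoundLoc."""

    def __init__(self, start, end):
        super().__init__([(start, end)])


loc = SimpleLoc(3, 6) + SimpleLoc(10, 13)
members = list(loc)
```

With this layout, memory use depends on the number of intervals rather than on how long each interval is.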
Questions I've looked at:
I'm considering two solutions. One is to avoid sum()
and use the loop offered in this comment. I don't understand why sum()
begins by adding the 0th item of the iterable to 0 rather than adding the 0th and 1st items (like the loop in the linked comment); I hope there's an arcane integer optimization reason.
My other solution is as follows; while I don't like the hard-coded zero check, it's the only way I've been able to make sum()
work.
    # ...
    def __radd__(self, other):
        # This allows sum() to work (the default start value is zero)
        if other == 0:
            return self
        return self.__add__(other)
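To see that method in a complete, runnable context, here is a small sketch. The Loc class and its intervals attribute are made up for the example; only the __radd__ body follows the snippet above:

```python
class Loc:
    """Hypothetical minimal class, only to exercise the __radd__ approach."""

    def __init__(self, intervals):
        self.intervals = list(intervals)

    def __add__(self, other):
        if not isinstance(other, Loc):
            return NotImplemented
        return Loc(self.intervals + other.intervals)

    def __radd__(self, other):
        # sum() begins with 0 + first_item; treat that 0 as "nothing so far".
        if other == 0:
            return self
        return self.__add__(other)


total = sum([Loc([(3, 6)]), Loc([(10, 13)]), Loc([(20, 22)])])

try:
    5 + Loc([(3, 6)])  # any other integer is still rejected
except TypeError:
    rejected_nonzero = True
```

Only the first addition goes through __radd__; every later step is an ordinary Loc + Loc.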
In summary, is there another way to use sum()
on objects that can neither be added to integers nor be empty?
Instead of sum
, use:
import operator
from functools import reduce
reduce(operator.add, seq)
In Python 2, reduce
was built-in so this looks like:
import operator
reduce(operator.add, seq)
Reduce is generally more flexible than sum - you can provide any binary function, not only add
, and you can optionally provide an initial element while sum
always uses one.
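A short illustration of that difference, using plain strings (which the built-in sum refuses to add) rather than the asker's classes:

```python
import operator
from functools import reduce

words = ["ab", "cd", "ef"]

joined = reduce(operator.add, words)   # "ab" + "cd" + "ef", no start value involved
single = reduce(operator.add, ["ab"])  # one element: returned as-is, add is never called

try:
    reduce(operator.add, [])           # nothing to return without an initial element
except TypeError:
    empty_failed = True

with_initial = reduce(operator.add, [], "")  # the initial element covers the empty case
```

So reduce sidesteps the 0 + obj problem, but an empty sequence still needs an initial element from somewhere.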
Also note: (Warning: maths rant ahead)
Providing support for add
w/r/t objects that have no neutral element is a bit awkward from an algebraic point of view.
Note that all of:
together with addition form a Monoid - i.e. they are associative and have some kind of neutral element.
If your operation isn't associative and doesn't have a neutral element, then it doesn't "resemble" addition. Hence, don't expect it to work well with sum
.
In such case, you might be better off with using a function or a method instead of an operator. This may be less confusing since the users of your class, seeing that it supports +
, are likely to expect that it will behave in a monoidic way (as addition normally does).
Thanks for expanding, I'll refer to your particular module now:
There are 2 concepts here:
It indeed makes sense that simple locations could be added, but they don't form a monoid because their addition doesn't satisfy the basic property of closure - the sum of two SimpleLocs isn't a SimpleLoc. It's, generally, a CompoundLoc.
OTOH, CompoundLocs with addition look like a monoid to me (a commutative monoid, while we're at it): a sum of those is a CompoundLoc too, their addition is associative and commutative, and the neutral element is an empty CompoundLoc that contains zero SimpleLocs.
If you agree with me (and the above matches your implementation), then you'll be able to use sum
as follows:
sum([SimpleLoc1, SimpleLoc2, SimpleLoc3], CompoundLoc())
Indeed, this appears to work.
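A self-contained sketch of that call, assuming (hypothetically) that CompoundLoc can be constructed with no intervals and that SimpleLoc is addable to it. The start value is passed positionally because sum() only accepts start as a keyword from Python 3.8 on:

```python
class CompoundLoc:
    """Hypothetical: holds zero or more right-open (start, end) pairs."""

    def __init__(self, intervals=()):
        self.intervals = list(intervals)

    def __add__(self, other):
        if not isinstance(other, CompoundLoc):
            return NotImplemented
        return CompoundLoc(self.intervals + other.intervals)


class SimpleLoc(CompoundLoc):
    """Hypothetical: exactly one interval."""

    def __init__(self, start, end):
        super().__init__([(start, end)])


locs = [SimpleLoc(3, 6), SimpleLoc(10, 13), SimpleLoc(20, 22)]

total = sum(locs, CompoundLoc())      # empty CompoundLoc is the neutral element
empty_total = sum([], CompoundLoc())  # an empty input gives back the neutral element
```

No __radd__ is needed here, since an integer never enters the computation.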
I am now tentatively thinking of attempting to subclass set such that the location is constructed as set(xrange(start, end)). However, adding sets will give Python (and mathematicians) fits.
Well, locations are some sets of numbers, so it makes sense to throw a set-like interface on top of them (so __contains__
, __iter__
, __len__
, perhaps __or__
as an alias of +
, __and__
as the product, etc).
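One way the | and & part of such an interface could look, as a sketch only. The Loc class is hypothetical, the union does not merge overlapping intervals, and the intersection is a simple pairwise scan:

```python
class Loc:
    """Hypothetical set-like location over right-open (start, end) pairs."""

    def __init__(self, intervals=()):
        self.intervals = sorted(tuple(iv) for iv in intervals)

    def __add__(self, other):
        if not isinstance(other, Loc):
            return NotImplemented
        return Loc(self.intervals + other.intervals)

    __or__ = __add__  # union spelled as |, aliasing +

    def __and__(self, other):
        if not isinstance(other, Loc):
            return NotImplemented
        # Pairwise overlap; fine for short interval lists, quadratic in general.
        overlaps = []
        for s1, e1 in self.intervals:
            for s2, e2 in other.intervals:
                lo, hi = max(s1, s2), min(e1, e2)
                if lo < hi:
                    overlaps.append((lo, hi))
        return Loc(overlaps)


a = Loc([(3, 6), (10, 13)])
b = Loc([(5, 11)])

union = a | b
common = a & b
```

A real implementation would probably also want to merge overlapping intervals in the union so that length and iteration do not count a number twice.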
As for construction from xrange
, do you really need it? If you know that you're storing sets of intervals, then you're likely to save space by sticking to your representation of [start, end)
pairs. You could throw in a utility method that takes an arbitrary sequence of integers and translates it to an optimal SimpleLoc
or CompoundLoc
if you feel it's going to help.
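Such a utility could be sketched roughly like this. The function name is made up, and it returns plain (start, end) tuples rather than SimpleLoc or CompoundLoc objects, since the real constructors aren't shown here:

```python
def intervals_from_integers(numbers):
    """Collapse integers into sorted, right-open (start, end) pairs."""
    intervals = []
    for n in sorted(set(numbers)):
        if intervals and intervals[-1][1] == n:
            # n directly follows the previous run, so extend that run by one.
            intervals[-1] = (intervals[-1][0], n + 1)
        else:
            # Gap before n: start a new run containing just n.
            intervals.append((n, n + 1))
    return intervals


pairs = intervals_from_integers([5, 3, 4, 10, 12, 11, 4])
```

One resulting pair would map to a SimpleLoc and several pairs to a CompoundLoc.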
I think that the best way to accomplish this is to provide the __radd__
method, or pass the start object to sum explicitly.
In case you really do not want to override __radd__
or provide a start object, how about redefining sum()
?
>>> from __builtin__ import sum as builtin_sum  # Python 2; the module is named builtins in Python 3
>>> def sum(iterable, startobj=MyCustomStartObject):
...     return builtin_sum(iterable, startobj)
...
Preferably use a function with a name like my_sum()
, but I guess that is one of the things you want to avoid (even though globally redefining builtin functions is probably something that a future maintainer will curse you for).
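A sketch of the separately named variant, which leaves the built-in untouched. Bag is a hypothetical class standing in for a type that does have a usable start object:

```python
class Bag:
    """Hypothetical addable type that does have an empty value."""

    def __init__(self, items=()):
        self.items = list(items)

    def __add__(self, other):
        if not isinstance(other, Bag):
            return NotImplemented
        return Bag(self.items + other.items)


def my_sum(iterable, start=None):
    """Like the built-in sum(), but starting from an empty Bag by default."""
    if start is None:
        start = Bag()  # fresh start object per call instead of a shared default
    return sum(iterable, start)


total = my_sum([Bag([1]), Bag([2, 3])])
nothing = my_sum([])
```

Because the wrapper has its own name, code elsewhere that calls sum() on numbers keeps working as before.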
Actually, implementing __add__
without the concept of an "empty object" makes little sense. sum
needs a start
parameter to support the sums of empty and one-element sequences, and you have to decide what result you expect in these cases:
sum([o1, o2]) => o1 + o2 # obviously
sum([o1]) => o1 # But how should __add__ be called here? Not at all?
sum([]) => ? # What now?
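For reference, this is what the built-in sum actually does in those three cases when no start is given. The Traced class is a hypothetical helper that just records each addition it takes part in:

```python
class Traced:
    """Hypothetical class that logs every addition it is involved in."""

    log = []

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        if not isinstance(other, Traced):
            return NotImplemented
        Traced.log.append(f"{self.name} + {other.name}")
        return Traced(f"({self.name} + {other.name})")

    def __radd__(self, other):
        Traced.log.append(f"{other!r} + {self.name}")
        return self  # accept whatever sum() started with


o1, o2 = Traced("o1"), Traced("o2")

empty_result = sum([])           # no additions at all: the start value 0 comes back
single_result = sum([o1])        # one addition: 0 + o1
log_after_single = list(Traced.log)

pair_result = sum([o1, o2])      # two additions: 0 + o1, then o1 + o2
log_after_pair = list(Traced.log)
```

So even a one-element sequence triggers an addition with the start value, and an empty one returns the start value unchanged.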
You could use an object that's universally neutral wrt. addition:
class Neutral:
    def __add__(self, other):
        return other

print(sum("A BC D EFG".split(), Neutral())) # ABCDEFG