Starmap modifying a parameter before passing it in?

I'm encountering a strange bug when using multiprocessing.Pool.starmap. The minimum code needed to reproduce it is here:

from multiprocessing import Pool

# Ignore the fact that this class is useless as-is, it has more code but it wasn't relevant to the bug
class Coordinate(tuple) :                                                                          

    def __new__(cls, *args):                                                                   
        return tuple.__new__(cls, args)                                                        

#Essentially just stores two coordinates
class Move :                                                     

    def __init__(self, oldPos, newPos) :      
        self.oldPos = oldPos                  
        self.newPos = newPos                  

    def __str__(self) :      
        return 'Old pos : ' + str(self.oldPos) + ' -- New pos : ' + str(self.newPos)

#Dummy function to show the problem
def funcThatNeedsTwoParams(move, otherParam) :
    print(move)             
    # Second param ignored, no problem there

p = Pool(2)  
moveOne = Move(Coordinate(0, 2), Coordinate(0, 1))
moveTwo = Move(Coordinate(2, 1), Coordinate(3, 0))
moveThree = Move(Coordinate(22345, -12400), Coordinate(153, 2357))
# The numbers are irrelevant, no effect on whether problem shows up or not

moves = [moveOne, moveTwo, moveThree]
paramsForStarmap = [[move, 'other param'] for move in moves]

print(paramsForStarmap)
#Output : 
#[[<__main__.Move object at 0x1023d4438>, 'other param'], [<__main__.Move object at 0x1023d4470>, 'other param'], [<__main__.Move object at 0x1023d44a8>
for move in [params[0] for params in paramsForStarmap] :
    print(move)
#Output : 
#Old pos : (0, 2) -- New pos : (0, 1)
#Old pos : (2, 1) -- New pos : (3, 0)
#Old pos : (22345, -12400) -- New pos : (153, 2357)
p.starmap(funcThatNeedsTwoParams, paramsForStarmap)
#Output :
#Old pos : ((0, 2),) -- New pos : ((0, 1),)
#Old pos : ((22345, -12400),) -- New pos : ((153, 2357),)
#Old pos : ((2, 1),) -- New pos : ((3, 0),)

Basically, I have a list of parameter pairs of the form [[move, otherParam], [move, otherParam], ...]. I print the first parameter of every pair to show that the moves are valid before calling starmap. Then I call starmap on the pool created earlier, passing it those pairs. Inexplicably, every move's coordinates then come out as tuples of the form ((coordinate),) instead of (coordinate).

I can't figure out why starmap would change the properties of an object passed to it. Any help would be greatly appreciated, thanks.

Marcus Buffett asked Dec 20 '25 03:12


1 Answer

This is an interesting one. The issue isn't just with starmap; it happens with all Pool functions (apply, map, etc.). And, as it turns out, the issue isn't with multiprocessing at all. Pool pickles the arguments to send them to the worker processes, and the problem appears whenever you pickle and unpickle a Coordinate instance:

>>> import pickle
>>> c = Coordinate(0,2)
>>> print(c)
(0, 2)
>>> str(pickle.loads(pickle.dumps(c)))
'((0, 2),)'
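To see where the extra layer of nesting comes from, here is a minimal sketch, assuming CPython 3 and a pickle protocol of 2 or higher (the default). It does not call pickle itself; it only inspects the reconstruction recipe that pickle would store. The variable names newargs, func, args and rebuilt are illustrative only.

```python
# The Coordinate class from the question, unchanged.
class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)


c = Coordinate(0, 2)

# tuple.__getnewargs__ hands back the whole tuple as ONE argument.
newargs = c.__getnewargs__()
print(newargs)   # ((0, 2),)

# The recipe pickle records for protocol 2: a callable plus its arguments.
# Here the arguments are (Coordinate, (0, 2)), so unpickling effectively
# calls Coordinate.__new__(Coordinate, (0, 2)).
func, args = c.__reduce_ex__(2)[:2]
rebuilt = func(*args)

# Because __new__ takes *args, the single tuple argument is wrapped again.
print(rebuilt)   # ((0, 2),)
```

In other words, the stored recipe assumes the class accepts one iterable argument, as plain tuple does, while Coordinate expects its elements as separate arguments.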

Pickling a tuple subclass isn't as straightforward as it looks, as it turns out. You can fix it by defining a __reduce__ method that tells pickle to rebuild the object by calling the class with the original elements as separate arguments:

class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __reduce__(self):
        return (self.__class__, tuple(self))

Now it pickles just fine:

>>> c = Coordinate(0,2)
>>> pickle.loads(pickle.dumps(c))
(0, 2)

And your example code works fine, too.
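As a rough check that does not require starting worker processes, the sketch below round-trips one of the question's parameter pairs through pickle, which is roughly what Pool does with each argument before handing it to a worker. The round_trip helper and the variable names are hypothetical, introduced only for this illustration.

```python
import pickle


class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __reduce__(self):
        # Rebuild as Coordinate(*elements) rather than Coordinate(elements).
        return (self.__class__, tuple(self))


class Move:
    def __init__(self, oldPos, newPos):
        self.oldPos = oldPos
        self.newPos = newPos

    def __str__(self):
        return 'Old pos : ' + str(self.oldPos) + ' -- New pos : ' + str(self.newPos)


def round_trip(obj):
    # Hypothetical helper: serialize and immediately deserialize an object.
    return pickle.loads(pickle.dumps(obj))


params = [Move(Coordinate(0, 2), Coordinate(0, 1)), 'other param']
copied = round_trip(params)

print(copied[0])   # Old pos : (0, 2) -- New pos : (0, 1)
```

The Move class needs no changes: its attributes are pickled through the normal instance dictionary, and each Coordinate inside it is restored through the __reduce__ method.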

dano answered Dec 21 '25 16:12
