Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to keep track of instances of python objects in a reliable way?

I would like to be able to keep track of instances of geometric Point objects in order to know what names are already "taken" when automatically naming a new one.

For instance, if Points named "A", "B" and "C" have been created, then the next automatically named Point is named "D". If Point named "D" gets deleted, or its reference gets lost, then name "D" becomes available again.

The main attributes of my Point objects are defined as properties and are the quite standard x, y and name.

Solution with a problem and a "heavy" workaround

I proceeded as described here, using a weakref.WeakSet(). I added this to my Point class:

# class attribute
instances = weakref.WeakSet()

@classmethod
def names_in_use(cls):
    return {p.name for p in Point.instances}

Problem is, when I instanciate a Point and then delete it, it is most of the time, but not always, removed from Point.instances. I noticed that, if I run the tests suite (pytest -x -vv -r w), then if a certain exception is raised in the test, then the instance never gets deleted (probable explanation to be read somewhat below).

In the following test code, after the first deletion of p, it always gets removed from Point.instances, but after the second deletion of p, it never gets deleted (test results are always the same) and the last assert statement fails:

def test_instances():
    import sys
    p = Point(0, 0, 'A')
    del p
    sys.stderr.write('1 - Point.instances={}\n'.format(Point.instances))
    assert len(Point.instances) == 0
    assert Point.names_in_use() == set()
    p = Point(0, 0, 'A')
    with pytest.raises(TypeError) as excinfo:
        p.same_as('B')
    assert str(excinfo.value) == 'Can only test if another Point is at the ' \
        'same place. Got a <class \'str\'> instead.'
    del p
    sys.stderr.write('2 - Point.instances={}\n'.format(Point.instances))
    assert len(Point.instances) == 0

And here the result:

tests/04_geometry/01_point_test.py::test_instances FAILED

=============================================================================== FAILURES ===============================================================================
____________________________________________________________________________ test_instances ____________________________________________________________________________

    def test_instances():
        import sys
        p = Point(0, 0, 'A')
        del p
        sys.stderr.write('1 - Point.instances={}\n'.format(Point.instances))
        assert len(Point.instances) == 0
        assert Point.names_in_use() == set()
        p = Point(0, 0, 'A')
        with pytest.raises(TypeError) as excinfo:
            p.same_as('B')
        assert str(excinfo.value) == 'Can only test if another Point is at the ' \
            'same place. Got a <class \'str\'> instead.'
        del p
        sys.stderr.write('2 - Point.instances={}\n'.format(Point.instances))
>       assert len(Point.instances) == 0
E       assert 1 == 0
E        +  where 1 = len(<_weakrefset.WeakSet object at 0x7ffb986a5048>)
E        +    where <_weakrefset.WeakSet object at 0x7ffb986a5048> = Point.instances

tests/04_geometry/01_point_test.py:42: AssertionError
------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------
1 - Point.instances=<_weakrefset.WeakSet object at 0x7ffb986a5048>
2 - Point.instances=<_weakrefset.WeakSet object at 0x7ffb986a5048>
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================================= 1 failed, 82 passed in 0.36 seconds ==================================================================

Yet, the code tested in the catched exception does not create a new Point instance:

def same_as(self, other):
    """Test geometric equality."""
    if not isinstance(other, Point):
        raise TypeError('Can only test if another Point is at the same '
                        'place. Got a {} instead.'.format(type(other)))
    return self.coordinates == other.coordinates

and coordinates are basically:

@property
def coordinates(self):
    return (self._x, self._y)

where _x and _y basically contain numbers.

The reason seems to be (quoting from python's doc):

CPython implementation detail: It is possible for a reference cycle to prevent the reference count of an object from going to zero. In this case, the cycle will be later detected and deleted by the cyclic garbage collector. A common cause of reference cycles is when an exception has been caught in a local variable.

The workaround

Adding this method to Point class:

def untrack(self):
    Point.instances.discard(self)

and using myPoint.untrack() before del myPoint (or before losing reference to the Point in another way) seems to solve the problem.

But this is quite heavy to have to call untrack() each time... in my tests there are a lot of Points I will need to "untrack" only to ensure all names are available, for instance.

Question

Is there any better way to keep track of these instances? (either by improving the tracking method used here, or by any other better mean).

like image 946
zezollo Avatar asked Jan 30 '18 19:01

zezollo


1 Answers

Don't try to track available names based on all Point objects that exist in the entire program. Predicting what objects will exist and when objects will cease to exist is difficult and unnecessary, and it will behave very differently on different Python implementations.

First, why are you trying to enforce Point name uniqueness at all? If, for example, you're drawing a figure in some window and you don't want two points with the same label in the same figure, then have the figure track the points in it and reject a new point with a taken name. This also makes it easy to explicitly remove points from a figure, or have two figures with independent point names. There are a number of other contexts where a similar explicit container object may be reasonable.

If these are free-floating points not attached to some geometry environment, then why name them at all? If I want to represent a point at (3.5, 2.4), I don't care whether I name it A or B or Bob, and I certainly don't want a crash because some other code somewhere halfway across the program decided to call their point Bob too. Why do names or name collisions matter?

I don't know what your use case is, but for most I can imagine, it'd be best to either only enforce name uniqueness within an explicit container, or not enforce name uniqueness at all.

like image 140
user2357112 supports Monica Avatar answered Oct 12 '22 23:10

user2357112 supports Monica