
memory usage @on_trait_change vs _foo_changed()

I built an application with Enthought Traits that uses too much memory. I think the problem is caused by trait notifications:

There seems to be a fundamental difference in memory usage between events caught with @on_trait_change and events caught with the special naming convention (e.g. _foo_changed()). I made a small example with two classes, Foo and FooDecorator, which I assumed would show exactly the same behaviour. But they don't!

from traits.api import *

class Foo(HasTraits):
    a = List(Int)

    def _a_changed(self):
        pass

    def _a_items_changed(self):
        pass

class FooDecorator(HasTraits):
    a = List(Int)

    @on_trait_change('a[]')
    def bar(self):
        pass

if __name__ == '__main__':
    n = 100000
    c = FooDecorator
    a = [c() for i in range(n)]

When I run this script with c = Foo, Windows Task Manager shows the whole Python process using 70 MB, and this stays constant as n increases. With c = FooDecorator, the Python process uses 450 MB, and the figure grows with n.
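Task Manager reports the footprint of the whole process, so a per-instance figure can be easier to compare. The following is a minimal sketch, assuming Python 3 (where the standard-library tracemalloc module is available; the script above was run on Python 2.7). bytes_per_instance is a hypothetical helper, and Light and Heavy are plain Python stand-ins rather than Traits classes, so the numbers only show how the helper is used.

```python
import gc
import tracemalloc


def bytes_per_instance(factory, n=10000):
    """Return the average number of traced bytes allocated per factory() call."""
    gc.collect()
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        # Keep every instance alive so its memory is still counted below.
        instances = [factory() for _ in range(n)]
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del instances
    return (after - before) / n


class Light(object):
    """Stand-in for a class with almost no per-instance state."""
    __slots__ = ()


class Heavy(object):
    """Stand-in for a class that builds extra objects for every instance."""
    def __init__(self):
        self.listeners = [object() for _ in range(20)]


if __name__ == '__main__':
    print('Light: %.0f bytes/instance' % bytes_per_instance(Light))
    print('Heavy: %.0f bytes/instance' % bytes_per_instance(Heavy))
```

Passing Foo and FooDecorator from the script above as the factory argument should give a per-instance cost for each notification style, without the noise of the interpreter's own baseline.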

Can you please explain this behaviour to me?

EDIT: Maybe I should rephrase: Why would anyone choose FooDecorator over Foo?

EDIT 2: I just uninstalled Python(x,y) 2.7.9 and installed the newest version of Canopy with Traits 4.5.0. The 450 MB has now become 750 MB.

EDIT 3: I compiled traits-4.6.0.dev0-py2.7-win-amd64 myself. The outcome is the same as in EDIT 2. So, plausible as it seemed, the leak addressed in https://github.com/enthought/traits/pull/248/files does not appear to be the cause.

HeinzKurt asked Jul 28 '15 16:07



2 Answers

I believe you are seeing the effect of a memory leak that has been fixed recently: https://github.com/enthought/traits/pull/248/files

As for why one would use the decorator, in this particular instance the two versions are practically equivalent.

In general, the decorator is more flexible: you can give a list of traits to listen to, and you can use the extended name notation, as described here: http://docs.enthought.com/traits/traits_user_manual/notification.html#semantics

For example, in this case:

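from traits.api import HasTraits, List, Str, on_trait_change

class Bar(HasTraits):
    b = Str

class FooDecorator(HasTraits):
    a = List(Bar)

    # Extended name: listens to a, to its items, and to b on each item.
    @on_trait_change('a.b')
    def bar(self):
        print('change')  # parenthesized form works on Python 2 and 3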

the bar notifier is called for changes to the trait a, to its items, and to the trait b of each Bar item. Extended names can be quite powerful.

pberkes answered Sep 21 '22 13:09



What's going on here is that Traits has two distinct ways of handling notifications: static notifiers and dynamic notifiers.

Static notifiers (such as those created by the specially-named _*_changed() methods) are fairly lightweight: each trait on an instance has a list of notifiers on it, which are basically the functions or methods with a lightweight wrapper.

Dynamic notifiers (such as those created with on_trait_change() and the extended trait name conventions like a[]) are significantly more powerful and flexible, but as a result they are much more heavyweight. In particular, in addition to the wrapper object they create, they also create a parsed representation of the extended trait name and a handler object, some of which are in turn HasTraits subclass instances.

As a result, even for a simple expression like a[] there will be a fair number of new Python objects created, and these objects have to be created for every on_trait_change listener on every instance separately to properly handle corner-cases like instance traits. The relevant code is here: https://github.com/enthought/traits/blob/master/traits/has_traits.py#L2330

Based on the reported numbers, the majority of the difference in memory usage that you are seeing comes from creating this dynamic listener infrastructure for each instance and each on_trait_change decorator.
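The per-instance cost described above can be pictured with a toy model. This is not Traits' implementation: Counted, Wrapper, ParsedName, Listener and the two *Style classes are hypothetical stand-ins, and sharing a single wrapper across all instances in StaticStyle is a simplifying assumption made for the sketch. It assumes Python 3 and only illustrates why helper objects built per listener and per instance grow with the number of instances.

```python
class Counted(object):
    """Base class that counts how many helper objects get constructed."""
    created = 0

    def __init__(self):
        Counted.created += 1


class Wrapper(Counted):
    """Toy stand-in for the lightweight wrapper around a handler."""
    def __init__(self, handler):
        super().__init__()
        self.handler = handler


class ParsedName(Counted):
    """Toy stand-in for the parsed form of an extended name such as 'a[]'."""
    def __init__(self, name):
        super().__init__()
        self.watch_items = name.endswith('[]')
        self.base = name[:-2] if self.watch_items else name


class Listener(Counted):
    """Toy stand-in for the handler object tying a parsed name to a wrapper."""
    def __init__(self, parsed, wrapper):
        super().__init__()
        self.parsed = parsed
        self.wrapper = wrapper


def _a_changed(obj):
    pass


class StaticStyle(object):
    # Simplifying assumption: one wrapper, built at class-definition time
    # and shared by every instance.
    notifiers = {'a': Wrapper(_a_changed)}

    def __init__(self):
        self.a = []


class DynamicStyle(object):
    def __init__(self):
        self.a = []
        # Rebuilt for every instance: wrapper + parsed name + listener.
        self.listeners = [Listener(ParsedName('a[]'), Wrapper(_a_changed))]


def helpers_created(cls, n):
    """Return how many Counted helper objects building n instances creates."""
    before = Counted.created
    instances = [cls() for _ in range(n)]  # held only until the function returns
    return Counted.created - before


if __name__ == '__main__':
    print(helpers_created(StaticStyle, 1000))   # 0
    print(helpers_created(DynamicStyle, 1000))  # 3000
```

In the toy, the static style adds nothing per instance while the dynamic style adds three helper objects per listener per instance. The real objects in Traits are larger than these stand-ins, which is consistent with the gap reported in the question.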

It's worth noting that there is a short-circuit for on_trait_change in the case where you are using a simple trait name, in which case it generates a static trait notifier instead of a dynamic notifier. So if you were to instead write something like:

class FooSimpleDecorator(HasTraits):
    a = List(Int)

    @on_trait_change('a')
    def a_updated(self):
        pass

    @on_trait_change('a_items')
    def a_items_updated(self):
        pass

you should see similar memory performance to the specially-named methods.

To answer the rephrased question about "why use on_trait_change", in FooDecorator you can write one method instead of two if your response to a change of either the list or any items in the list is the same. This makes code significantly easier to debug and maintain, and if you aren't creating thousands of these objects then the extra memory usage is negligible.

This becomes even more of a factor when you consider more sophisticated extended trait name patterns, where the dynamic listeners automatically handle changes which would otherwise require significant manual (and error-prone) code for hooking up and removing listeners from intermediate objects and traits. The power and simplicity of this approach usually outweighs the concerns about memory usage.

Corran Webster answered Sep 21 '22 13:09
