
Python: How to create private class variables using setattr or exec?

I've just run into a situation where pseudo-private class member names aren't getting mangled when using setattr or exec.

In [1]: class T:
   ...:     def __init__(self, **kwargs):
   ...:         self.__x = 1
   ...:         for k, v in kwargs.items():
   ...:             setattr(self, "__%s" % k, v)
   ...:         
In [2]: T(y=2).__dict__
Out[2]: {'_T__x': 1, '__y': 2}

I've tried exec("self.__%s = %s" % (k, v)) as well with the same result:

In [1]: class T:
   ...:     def __init__(self, **kwargs):
   ...:         self.__x = 1
   ...:         for k, v in kwargs.items():
   ...:             exec("self.__%s = %s" % (k, v))
   ...:         
In [2]: T(z=3).__dict__
Out[2]: {'_T__x': 1, '__z': 3}

Doing self.__dict__["_%s__%s" % (self.__class__.__name__, k)] = v would work, but __dict__ is a readonly attribute.

Is there another way that I can dynamically create these pseudo-private class members (without hard-coding in the name mangling)?


A better way to phrase my question:

What does python do “under the hood” when it encounters a double underscore (self.__x) attribute being set? Is there a magic function that is used to do the mangling?

chown asked Oct 16 '11 03:10


2 Answers

I believe Python does private attribute mangling during compilation: specifically, at the stage where it has just parsed the source into an abstract syntax tree and is rendering it to bytecode. This is the only point at which Python knows the name of the class within whose (lexical) scope the function is defined. It mangles pseudo-private attributes and variables there, and leaves everything else unchanged. This has a couple of implications...

  • String constants in particular are not mangled, which is why your setattr(self, "__X", x) is being left alone.

  • Since mangling relies on the lexical scope of the function within the source, functions defined outside of the class and then "inserted" do not have any mangling done, since the information about the class they "belong to" was not known at compile-time.

  • As far as I know, there isn't an easy way to determine (at runtime) what class a function was defined in... At least not without a lot of inspect calls that rely on source reflection to compare line numbers between the function and class sources. Even that approach isn't 100% reliable; there are edge cases that can cause erroneous results.

  • The process is actually rather indelicate about the mangling: if you access the __X attribute on an object that isn't an instance of the class the function is lexically defined within, it'll still mangle it for that class... letting you store one class's private attrs on instances of other classes! (I'd almost argue this last point is a feature, not a bug)
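The last three points can be seen in a short sketch. The names Owner, Unrelated, stash and set_token are made up for illustration; only the mangling behaviour itself comes from Python.

```python
class Owner:
    def stash(self, other):
        # Compiled inside Owner, so this is stored as other._Owner__secret,
        # even though `other` is not an Owner instance.
        other.__secret = 42


class Unrelated:
    pass


def set_token(self):
    # Defined outside any class body, so the name stays "__token".
    self.__token = "abc"


# Attach the function to the class after the fact; nothing is re-mangled.
Owner.set_token = set_token

target = Unrelated()
owner = Owner()
owner.stash(target)
owner.set_token()

print(vars(target))  # {'_Owner__secret': 42}
print(vars(owner))   # {'__token': 'abc'}
```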

So the mangling has to be done manually: you calculate what the mangled attr should be, then pass that name to setattr.


Regarding the mangling itself, it's done by the _Py_Mangle function, which uses the following logic:

  • __X gets an underscore and the class name prepended. E.g. if it's Test, the mangled attr is _Test__X.
  • The only exception is if the class name begins with any underscores: these are stripped off first. E.g. if the class is __Test, the mangled attr is still _Test__X.
  • Trailing underscores in a class name are not stripped.
  • If the class name consists only of underscores, no mangling is done at all.
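As a quick check of these rules, here is a small sketch that lets the compiler do the mangling and then inspects the instance dictionaries. The three classes are throwaway examples, defined at module level so that their own names are not mangled.

```python
class Test:
    def __init__(self):
        self.__X = 1         # private name: mangled
        self.__dunder__ = 2  # ends with two underscores: left alone
        self._single = 3     # single leading underscore: left alone


class __Test:
    def __init__(self):
        self.__X = 1  # leading underscores of the class name are stripped


class Test__:
    def __init__(self):
        self.__X = 1  # trailing underscores of the class name are kept


plain = sorted(vars(Test()))
leading = sorted(vars(__Test()))
trailing = sorted(vars(Test__()))

print(plain)     # ['_Test__X', '__dunder__', '_single']
print(leading)   # ['_Test__X']
print(trailing)  # ['_Test____X']
```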

To wrap this all up in a function...

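def mangle_attr(source, attr):
    # return public attrs (and dunder or dotted names) unchanged
    if not attr.startswith("__") or attr.endswith("__") or '.' in attr:
        return attr
    # if source is an object, get the class
    if not hasattr(source, "__bases__"):
        source = source.__class__
    # strip leading underscores from the class name
    clsname = source.__name__.lstrip("_")
    # a class name made up only of underscores disables mangling
    if not clsname:
        return attr
    # mangle attr
    return "_%s%s" % (clsname, attr)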

I know this somewhat "hardcodes" the name mangling, but it is at least isolated to a single function. It can then be used to mangle strings for setattr:

# you should then be able to use this w/in the code...
setattr(self, mangle_attr(self, "__X"), value)

# note that would set the private attr for type(self),
# if you wanted to set the private attr of a specific class,
# you'd have to choose it explicitly...
setattr(self, mangle_attr(somecls, "__X"), value)
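Applied to the class from the question, this might look like the sketch below. The helper is repeated so the snippet runs on its own, and get_y is a made-up method added only to show that ordinary self.__y access inside the class finds the value. Passing T explicitly, rather than self, is one possible choice: it keeps the names private to T even when a subclass instance is being initialised.

```python
def mangle_attr(source, attr):
    # Same logic as the helper above, repeated so this runs standalone.
    if not attr.startswith("__") or attr.endswith("__") or "." in attr:
        return attr
    if not hasattr(source, "__bases__"):
        source = source.__class__
    clsname = source.__name__.lstrip("_")
    return "_%s%s" % (clsname, attr) if clsname else attr


class T:
    def __init__(self, **kwargs):
        self.__x = 1
        for k, v in kwargs.items():
            # Build the mangled name by hand, since setattr only sees a string.
            setattr(self, mangle_attr(T, "__" + k), v)

    def get_y(self):
        return self.__y  # compiled as self._T__y


t = T(y=2)
print(t.__dict__)  # {'_T__x': 1, '_T__y': 2}
print(t.get_y())   # 2
```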

Alternatively, the following mangle_attr implementation compiles a throwaway class with the same name, so that it always uses Python's current mangling logic (though I don't think the logic laid out above has ever changed)...

_mangle_template = """
class {cls}:
    @staticmethod
    def mangle():
        {attr} = 1
cls = {cls}
"""

def mangle_attr(source, attr):
    # if source is an object, get the class
    if not hasattr(source, "__bases__"):
        source = source.__class__
    # mangle attr
    tmp = {}
    code = _mangle_template.format(cls=source.__name__, attr=attr)
    eval(compile(code, '', 'exec'), {}, tmp)
    return tmp['cls'].mangle.__code__.co_varnames[0]

# NOTE: the '__code__' attr above needs to be 'func_code' for python 2.5 and older
Eli Collins answered Sep 18 '22 14:09


Addressing this:

What does python do “under the hood” when it encounters a double underscore (self.__x) attribute being set? Is there a magic function that is used to do the mangling?

AFAIK, it's basically special-cased in the compiler. So once it's in bytecode, the name is already mangled; the interpreter never sees the unmangled name at all, and has no idea that any special handling was needed. This is why references through setattr, exec, or by looking up a string in __dict__ don't work; the compiler sees all of those as strings, and doesn't know that they have anything to do with attribute access, so it passes them through unchanged. The interpreter knows nothing of the name mangling, so it just uses them directly.
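One way to see this is to look at the compiled code object of a method. This is a small sketch with a made-up method set_x; co_names and co_consts are standard attributes of code objects.

```python
class T:
    def set_x(self):
        self.__x = 1             # attribute access: mangled by the compiler
        setattr(self, "__y", 2)  # string constant: left untouched


code = T.set_x.__code__

# Names used for attribute and global lookups already carry the mangled form.
print(code.co_names)

# String constants are stored exactly as written in the source.
print(code.co_consts)

t = T()
t.set_x()
print(sorted(vars(t)))  # ['_T__x', '__y']
```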

The times I've needed to get around this, I've just manually done the same name mangling, hacky as that is. I've found that using these 'private' names is generally a bad idea, unless it's a case where you know you need them for their intended purpose: to allow an inheritance hierarchy of classes to all use the same attribute name but have a copy per class. Peppering attribute names with double underscores just because they're supposed to be private implementation details seems to cause more harm than benefit; I've taken to just using a single underscore as a hint that external code shouldn't be touching it.
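For reference, the intended purpose looks roughly like this. Base, Child and the __cache attribute are invented for the example: both classes use the same attribute name without clobbering each other.

```python
class Base:
    def __init__(self):
        self.__cache = "base"  # stored as _Base__cache

    def base_cache(self):
        return self.__cache


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__cache = "child"  # stored as _Child__cache, separate from Base's

    def child_cache(self):
        return self.__cache


c = Child()
print(c.base_cache())   # base
print(c.child_cache())  # child
print(sorted(vars(c)))  # ['_Base__cache', '_Child__cache']
```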

Ben answered Sep 18 '22 14:09