Python name mangling

When in doubt, leave it "public" - I mean, do not add anything to obscure the name of your attribute. If you have a class with some internal value, do not bother about it. Instead of writing:

class Stack(object):

    def __init__(self):
        self.__storage = [] # Too uptight

    def push(self, value):
        self.__storage.append(value)

write this by default:

class Stack(object):

    def __init__(self):
        self.storage = [] # No mangling

    def push(self, value):
        self.storage.append(value)

This is certainly a controversial way of doing things. Python newbies hate it, and even some old Python guys despise this default - but it is the default anyway, so I recommend that you follow it, even if you feel uncomfortable.

If you really want to send the message "Can't touch this!" to your users, the usual way is to precede the variable name with one underscore. This is just a convention, but people understand it and take extra care when dealing with such names:

class Stack(object):

    def __init__(self):
        self._storage = [] # This is OK: "internal" by convention, and Pythonistas stay relaxed about it

    def push(self, value):
        self._storage.append(value)

This can be useful, too, for avoiding conflict between property names and attribute names:

 class Person(object):
     def __init__(self, name, age):
         self.name = name
         self._age = age if age >= 0 else 0
     
     @property
     def age(self):
         return self._age
     
     @age.setter
     def age(self, age):
         if age >= 0:
             self._age = age
         else:
             self._age = 0

What about the double underscore? Well, we use the double underscore magic mainly to avoid accidental overriding of methods and name conflicts with superclasses' attributes. It can be pretty valuable if you write a class that will be extended many times.

If you want to use it for other purposes, you can, but it is neither usual nor recommended.
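As a minimal sketch of that collision-avoidance use (the class and method names here are hypothetical, chosen only for illustration), a parent and a subclass can each use `__items` without stepping on each other, because each name is mangled with its own class name:

```python
class Base:
    def __init__(self):
        self.__items = []          # stored on the instance as _Base__items

    def add(self, item):
        self.__items.append(item)

    def count(self):
        return len(self.__items)


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__items = {}          # stored as _Child__items, a separate attribute

    def tag(self, key, value):
        self.__items[key] = value


child = Child()
child.add("a")                     # Base.add still sees its own list
child.tag("k", "v")
print(child.count())               # 1
print(sorted(vars(child)))         # ['_Base__items', '_Child__items']
```

With a single underscore (`_items`) in both classes, `Child.__init__` would have replaced the list that `Base.add` relies on.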

EDIT: Why is this so? Well, the usual Python style does not emphasize making things private - on the contrary! There are many reasons for that - most of them controversial... Let us see some of them.

Python has properties

Today, most OO languages use the opposite approach: what should not be used should not be visible, so attributes should be private. Theoretically, this would yield more manageable, less coupled classes because no one would change the objects' values recklessly.

However, it is not so simple. For example, Java classes have many getters that only get the values and setters that only set the values. You need, let us say, seven lines of code to declare a single attribute - which a Python programmer would say is needlessly complex. Also, in practice you end up writing all that code for what is effectively a public field, since anyone can read and change its value through the getter and setter.

So why follow this private-by-default policy? Just make your attributes public by default. Of course, this is problematic in Java because if you decide to add some validation to your attribute, you would have to change every:

person.age = age;

in your code to, let us say,

person.setAge(age);

setAge() being:

public void setAge(int age) {
    if (age >= 0) {
        this.age = age;
    } else {
        this.age = 0;
    }
}

So in Java (and other languages), the default is to use getters and setters anyway: they are annoying to write, but they can save you a lot of time if you find yourself in the situation I've described.

However, you do not need to do it in Python since Python has properties. If you have this class:

 class Person(object):
     def __init__(self, name, age):
         self.name = name
         self.age = age

...and then you decide to validate ages, you do not need to change the person.age = age pieces of your code. Just add a property (as shown below):

 class Person(object):
     def __init__(self, name, age):
         self.name = name
         self._age = age if age >= 0 else 0
     
     @property
     def age(self):
         return self._age
     
     @age.setter
     def age(self, age):
         if age >= 0:
             self._age = age
         else:
             self._age = 0

If you can do that and still use person.age = age, why would you add private fields and getters and setters?
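To make the point concrete, here is a small sketch of client code that was written against a plain `age` attribute and keeps working after validation is introduced. The `birthday` function is hypothetical, and unlike the class above, this variant routes the assignment in `__init__` through the setter so the rule lives in one place:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age             # goes through the setter below

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        # Same rule as in the text: negative ages are clamped to 0.
        self._age = value if value >= 0 else 0


def birthday(person):
    # Client code written for the plain-attribute version of Person.
    person.age = person.age + 1


p = Person("Ann", 30)
birthday(p)
print(p.age)                       # 31
p.age = -5
print(p.age)                       # 0
```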

(Also, see "Python is not Java" and this article about the harms of using getters and setters.)

Everything is visible anyway - and trying to hide complicates your work

Even in languages with private attributes, you can access them through some reflection/introspection library. And people do it a lot, in frameworks and for solving urgent needs. The problem is that introspection libraries are just a complicated way of doing what you could do with public attributes.

Since Python is a very dynamic language, adding this burden to your classes is counterproductive.

The problem is not being able to see - it is being required to see

For a Pythonista, encapsulation is not the inability to see the internals of classes but the possibility of avoiding looking at them. Encapsulation is the property of a component that lets the user use it without worrying about the internal details. If you can use a component without bothering yourself about its implementation, then it is encapsulated (in the opinion of a Python programmer).

Now, if you wrote a class that can be used without thinking about implementation details, there is no problem if someone wants to look inside the class for some reason. The point is: your API should be good, and the rest is details.

Guido said so

Well, this is not controversial: he said so, actually. (Look for "open kimono.")

This is culture

Yes, there are some reasons, but no critical reason. This is primarily a cultural aspect of programming in Python. Frankly, it could be the other way, too - but it is not. Also, you could just as easily ask the other way around: why do some languages use private attributes by default? For the same main reason as for the Python practice: because it is the culture of these languages, and each choice has advantages and disadvantages.

Since this culture already exists, you are well-advised to follow it. Otherwise, you will get annoyed by Python programmers telling you to remove the __ from your code when you ask a question on Stack Overflow :)

First - What is name mangling?

Name mangling is invoked when you are in a class definition and use __any_name or __any_name_, that is, two (or more) leading underscores and at most one trailing underscore.

class Demo:
    __any_name = "__any_name"
    __any_other_name_ = "__any_other_name_"

And now:

>>> [n for n in dir(Demo) if 'any' in n]
['_Demo__any_name', '_Demo__any_other_name_']
>>> Demo._Demo__any_name
'__any_name'
>>> Demo._Demo__any_other_name_
'__any_other_name_'
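The rule can also be applied by hand. The helper below is a hypothetical sketch, not part of any library; it approximates what the compiler does for a single identifier and ignores rarer edge cases:

```python
def mangle(class_name, attr):
    """Approximate the compiler's private-name mangling for one identifier."""
    # Only names with two or more leading underscores that do not
    # end in two underscores are affected.
    if not attr.startswith("__") or attr.endswith("__"):
        return attr
    # Leading underscores of the class name are dropped; a class name made
    # only of underscores leaves the attribute name untouched.
    stripped = class_name.lstrip("_")
    if not stripped:
        return attr
    return "_" + stripped + attr


print(mangle("Demo", "__any_name"))         # _Demo__any_name
print(mangle("Demo", "__any_other_name_"))  # _Demo__any_other_name_
print(mangle("Demo", "__init__"))           # __init__
print(mangle("_Hidden", "__x"))             # _Hidden__x
```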

When in doubt, do what?

The ostensible use is to prevent subclassers from using an attribute that the class uses.

A potential value is in avoiding name collisions with subclassers who want to override behavior, so that the parent class functionality keeps working as expected. However, the example in the Python documentation is not Liskov substitutable, and no examples come to mind where I have found this useful.

The downsides are that it increases cognitive load for reading and understanding a code base, and especially so when debugging where you see the double underscore name in the source and a mangled name in the debugger.
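A small illustration of that mismatch, using a hypothetical `Account` class: the source says `__balance`, but the instance dictionary (which is what a debugger typically displays) holds the mangled name.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance   # written as __balance in the source

    def balance(self):
        return self.__balance


acct = Account(100)
print(vars(acct))                  # {'_Account__balance': 100}
```

Searching the code base for `_Account__balance` finds nothing, so the reader has to undo the mangling mentally.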

My personal approach is to intentionally avoid it. I work on a very large code base. The rare uses of it stick out like a sore thumb and do not seem justified.

You do need to be aware of it so you know it when you see it.

PEP 8

PEP 8, the Python standard library style guide, currently says (abridged):

There is some controversy about the use of __names.

If your class is intended to be subclassed, and you have attributes that you do not want subclasses to use, consider naming them with double leading underscores and no trailing underscores.

Note that only the simple class name is used in the mangled name, so if a subclass chooses both the same class name and attribute name, you can still get name collisions.

Name mangling can make certain uses, such as debugging and __getattr__(), less convenient. However the name mangling algorithm is well documented and easy to perform manually.

Not everyone likes name mangling. Try to balance the need to avoid accidental name clashes with potential use by advanced callers.
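The collision that PEP 8 warns about can be sketched as follows. The names are hypothetical, and the factory function only stands in for a subclass that lives in another module and happens to reuse the simple class name:

```python
class Plugin:
    def __init__(self):
        self.__state = "base"      # stored as _Plugin__state

    def base_state(self):
        return self.__state


def make_subclass(base):
    # A subclass that reuses the simple name "Plugin".
    class Plugin(base):
        def __init__(self):
            super().__init__()
            self.__state = "sub"   # also _Plugin__state: overwrites the base value

    return Plugin


SubPlugin = make_subclass(Plugin)
obj = SubPlugin()
print(obj.base_state())            # sub
print(list(vars(obj)))             # ['_Plugin__state']
```

Only one attribute exists on the instance, so the parent's method now reads the subclass's value.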

How does it work?

If you prepend two underscores (without ending double-underscores) in a class definition, the name will be mangled, and an underscore followed by the class name will be prepended on the object:

>>> class Foo(object):
...     __foobar = None
...     _foobaz = None
...     __fooquux__ = None
... 
>>> [name for name in dir(Foo) if 'foo' in name]
['_Foo__foobar', '__fooquux__', '_foobaz']

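Note that names are only mangled when they appear as identifiers inside the class body at the time the class definition is compiled; an attribute assigned from outside is stored under its literal name: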

>>> Foo.__test = None
>>> Foo.__test
>>> Foo._Foo__test
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'Foo' has no attribute '_Foo__test'
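Because the rewriting applies to identifiers in the class body, strings and functions defined elsewhere are left alone. A short sketch, with hypothetical method names:

```python
class Foo:
    __secret = 1                   # mangled: stored as _Foo__secret

    def via_name(self):
        return self.__secret       # mangled: identifier inside the class body

    def via_string(self):
        # The string "__secret" is data, not an identifier, so it is not mangled.
        return getattr(self, "__secret", "not found")


def outside(self):
    return self.__secret           # not mangled: defined outside any class


Foo.outside = outside
foo = Foo()
print(foo.via_name())              # 1
print(foo.via_string())            # not found
try:
    foo.outside()
except AttributeError:
    print("outside() looked for a literal '__secret' attribute")
```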

Also, those new to Python sometimes have trouble understanding what's going on when they can't manually access a name they see defined in a class definition. This is not a strong reason against it, but it's something to consider if you have a learning audience.

One Underscore?

If the convention is to use only one underscore, I'd also like to know the rationale.

When my intention is for users to keep their hands off an attribute, I tend to only use the one underscore, but that's because in my mental model, subclassers would have access to the name (which they always have, as they can easily spot the mangled name anyways).

If I were reviewing code that uses the __ prefix, I would ask why they're invoking name mangling, and if they couldn't do just as well with a single underscore, keeping in mind that if subclassers choose the same names for the class and class attribute there will be a name collision in spite of this.

I wouldn't say that practice produces better code. Visibility modifiers only distract you from the task at hand, and as a side effect force your interface to be used as you intended. Generally speaking, enforcing visibility prevents programmers from messing things up if they haven't read the documentation properly.

A far better solution is the route that Python encourages: your classes and variables should be well documented, and their behaviour clear. The source should be available. This is a far more extensible and reliable way to write code.

My strategy in Python is this:

Just write the damn thing, make no assumptions about how your data should be protected. This assumes that you write to create the ideal interfaces for your problems.
Use a leading underscore for stuff that probably won't be used externally, and isn't part of the normal "client code" interface.
Use double underscore only for things that are purely convenience inside the class, or will cause considerable damage if accidentally exposed.

Above all, it should be clear what everything does. Document it if someone else will be using it. Document it if you want it to be useful in a year's time.

As a side note, you should actually be going with protected in those other languages: you never know whether your class might be inherited later, or what it might be used for. Best to only protect those variables that you are certain cannot or should not be used by foreign code.

You shouldn't start with private data and make it public as necessary. Rather, you should start by figuring out the interface of your object. I.e. you should start by figuring out what the world sees (the public stuff) and then figure out what private stuff is necessary for that to happen.

Other languages make it difficult to make private that which was once public. I.e. I'll break lots of code if I make my variable private or protected. But with properties in Python this isn't the case. Rather, I can maintain the same interface even while rearranging the internal data.
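As an illustration of keeping the interface while rearranging the data, consider this hypothetical `Temperature` class. Assume an earlier version stored a plain `celsius` attribute; this version stores kelvin internally, yet callers still read and write `.celsius`:

```python
class Temperature:
    """Stores kelvin internally but still exposes .celsius."""

    def __init__(self, celsius):
        self.celsius = celsius     # routed through the property setter

    @property
    def celsius(self):
        return self._kelvin - 273.15

    @celsius.setter
    def celsius(self, value):
        self._kelvin = value + 273.15


t = Temperature(25.0)
t.celsius = 30.0                   # unchanged client code
print(round(t.celsius, 2))         # 30.0
print(round(t._kelvin, 2))         # 303.15
```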

The difference between _ and __ is that Python actually makes an attempt to enforce the latter. Of course, it doesn't try really hard, but it does make access more awkward. A single _ merely tells other programmers what the intention is; they are free to ignore it at their peril. But ignoring that rule is sometimes helpful. Examples include debugging, temporary hacks, and working with third-party code that wasn't intended to be used the way you use it.
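The contrast can be seen in a few lines. The `Cache` class below is hypothetical and exists only to show how each prefix behaves from outside the class:

```python
class Cache:
    def __init__(self):
        self._hits = 0             # single underscore: a naming convention only
        self.__store = {}          # double underscore: stored as _Cache__store


cache = Cache()
cache._hits += 1                         # allowed, merely discouraged
print(cache._hits)                       # 1
print(hasattr(cache, "__store"))         # False
print(hasattr(cache, "_Cache__store"))   # True
cache._Cache__store["k"] = "v"           # the "temporary hack" route still works
print(cache._Cache__store)               # {'k': 'v'}
```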
