Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are the internals of Pythons str.join()? (Hiding passwords from output)

Tags:

python

I just stumbled upon an interesting(?) way to hide passwords (and other personal data) from general output from screen to logfiles.

In his book How to make mistakes in Python Mike Pirnat suggests to implement a class for sensitive strings and to overload its __str__- and __repr__-methods.

I experimented with that and got this:

class secret(str):

    def __init__(self, s):
        self.string = s

    def __repr__(self):
        return "'" + "R"*len(self.string) + "'"

    def __str__(self):
        return "S" * len(self.string)

    def __add__(self, other):
        return str.__add__(self.__str__(), other)

    def __radd__(self, other):
        return str.__add__(other, self.__str__())

    def __getslice__(self, i, j):
        return ("X"*len(self.string))[i:j]

(I'm aware that using len provides information about the content to hide. It's just for testing.)

It works fine in this cases:

pwd = secret("nothidden")

print("The passwort is " + pwd)                  # The passwort is SSSSSSSSS
print(pwd + " is the passwort.")                 # SSSSSSSSS is the password.

print("The passwort is {}.".format(pwd))         # The password is SSSSSSSSS.
print(["The", "passwort", "is", pwd])            # ['The', 'password', 'is', 'RRRRRRRRR']
print(pwd[:])                                    # XXXXXXXXX

However this does not work:

print(" ".join(["The", "password", "is", pwd]))  # The password is nothidden

So, how does str.join() work internally? Which method would I have to overload to obscure the string?

like image 509
Robin Koch Avatar asked Nov 08 '16 10:11

Robin Koch


1 Answers

The issue is that you are inheriting from str, which likely implements __new__ which means that even though you avoided calling the parent constructor in your class, the underlying C object is still initialized with it.

Now join is probably checking if it has a str subclass and, being implemented in C, it access directly the underlying C structure, or uses an other str-related function which bypasses __str__ and __repr__ (think about it: if the value is a string or a string subclass, why would the code call __str__ or __repr__ to obtain its value? It just accesses the underlying character array in some way!)

To fix this: do not inherit from str! Unfortunately this means you will not be able to use that object exactly like a string in some situations, but that's pretty much inevitable.


An alternative that may work is to implement __new__ and feed a different value to str's __new__ method:

class secret(str):
    def __new__(cls, initializer):
        return super(secret, cls).__new__(cls, 'X'*len(initializer))
    def __init__(self, initializer):
        self.text = initializer
    def __repr__(self):
        return "'{}'".format("R"*len(self))
    def __str__(self):
        return "S"*len(self)
    def __add__(self, other):
        return str(self) + other
    def __radd__(self, other):
        return other + str(self)

Which results in:

In [19]: pwd = secret('nothidden')

In [20]: print("The passwort is " + pwd)                  # The passwort is SSSSSSSSS
    ...: print(pwd + " is the passwort.")                 # SSSSSSSSS is the password.
    ...: 
    ...: print("The passwort is {}.".format(pwd))         # The password is SSSSSSSSS.
    ...: print(["The", "passwort", "is", pwd])            # ['The', 'password', 'is', 'RRRRRRRRR']
    ...: print(pwd[:])
The passwort is SSSSSSSSS
SSSSSSSSS is the passwort.
The passwort is SSSSSSSSS.
['The', 'passwort', 'is', 'RRRRRRRRR']
XXXXXXXXX

In [21]: print(" ".join(["The", "password", "is", pwd]))
The password is XXXXXXXXX

However I fail to really see how this is useful. I mean: the purpose of this class is to avoid programming errors that end up display sensitive information? But then having an exception being triggered is better so that you can identify the bugs! For that it's probably best to raise NotImplementedError inside __str__ and __repr__ instead of silently provided a useless value... sure you don't leak the secret but finding bugs becomes really hard.

like image 105
Bakuriu Avatar answered Oct 16 '22 11:10

Bakuriu