I just stumbled upon an interesting(?) way to keep passwords (and other personal data) out of general output, from the screen to log files.
In his book How to Make Mistakes in Python, Mike Pirnat suggests implementing a class for sensitive strings and overloading its __str__ and __repr__ methods.
I experimented with that and got this:
class secret(str):
    def __init__(self, s):
        self.string = s

    def __repr__(self):
        return "'" + "R"*len(self.string) + "'"

    def __str__(self):
        return "S" * len(self.string)

    def __add__(self, other):
        return str.__add__(self.__str__(), other)

    def __radd__(self, other):
        return str.__add__(other, self.__str__())

    # Python 2 only; Python 3 routes slicing through __getitem__
    def __getslice__(self, i, j):
        return ("X"*len(self.string))[i:j]
(I'm aware that using len provides information about the content to hide. It's just for testing.)
It works fine in these cases:
pwd = secret("nothidden")
print("The password is " + pwd)            # The password is SSSSSSSSS
print(pwd + " is the password.")           # SSSSSSSSS is the password.
print("The password is {}.".format(pwd))   # The password is SSSSSSSSS.
print(["The", "password", "is", pwd])      # ['The', 'password', 'is', 'RRRRRRRRR']
print(pwd[:])                              # XXXXXXXXX
However, this does not work:
print(" ".join(["The", "password", "is", pwd])) # The password is nothidden
So, how does str.join() work internally? Which method would I have to overload to obscure the string?
The issue is that you are inheriting from str, which likely implements __new__. That means that even though you avoided calling the parent constructor in your class, the underlying C object is still initialized with the original value.
Now, join is probably checking whether it has a str subclass and, being implemented in C, it accesses the underlying C structure directly, or uses another str-related function that bypasses __str__ and __repr__. (Think about it: if the value is a string or a string subclass, why would the code call __str__ or __repr__ to obtain its value? It just accesses the underlying character array in some way!)
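The difference between the two code paths can be seen in a few lines. This is only a minimal sketch: Masked is a hypothetical stand-in for the class from the question, reduced to the two overridden methods.

```python
class Masked(str):
    """Hypothetical str subclass that only overrides __str__ and __repr__."""

    def __str__(self):
        return "S" * len(self)

    def __repr__(self):
        return "'" + "R" * len(self) + "'"


pwd = Masked("nothidden")

shown = str(pwd)                 # goes through the overridden __str__
joined = " ".join(["is", pwd])   # reads the characters stored in the str object

print(shown)   # SSSSSSSSS
print(joined)  # is nothidden
```

So the overrides only help where the caller explicitly asks for str() or repr(); anything that treats the object as the string it already is sees the real content.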
To fix this: do not inherit from str! Unfortunately, this means you will not be able to use that object exactly like a string in some situations, but that's pretty much inevitable.
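A rough sketch of the non-inheriting variant, assuming a plain wrapper class is acceptable. The names Secret and reveal are made up for illustration and are not from the book or the question.

```python
class Secret:
    """Hypothetical wrapper that holds the value instead of being a str."""

    def __init__(self, value):
        self._value = value

    def __str__(self):
        return "S" * len(self._value)

    def __repr__(self):
        return "'" + "R" * len(self._value) + "'"

    def reveal(self):
        # Explicit access to the real value, easy to search for in a code base.
        return self._value


pwd = Secret("nothidden")

try:
    " ".join(["The", "password", "is", pwd])
    join_failed = False
except TypeError:
    # str.join only accepts str instances, so the mistake surfaces here.
    join_failed = True

# Converting explicitly goes through __str__ and yields the masked text.
masked = " ".join(["The", "password", "is", str(pwd)])
print(masked)  # The password is SSSSSSSSS
```

The price is the one mentioned above: the object no longer passes for a string, so concatenation and join need an explicit str() or reveal().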
An alternative that may work is to implement __new__ and feed a different value to str's __new__ method:
class secret(str):
    def __new__(cls, initializer):
        return super(secret, cls).__new__(cls, 'X'*len(initializer))

    def __init__(self, initializer):
        self.text = initializer

    def __repr__(self):
        return "'{}'".format("R"*len(self))

    def __str__(self):
        return "S"*len(self)

    def __add__(self, other):
        return str(self) + other

    def __radd__(self, other):
        return other + str(self)
Which results in:
In [19]: pwd = secret('nothidden')
In [20]: print("The password is " + pwd)            # The password is SSSSSSSSS
    ...: print(pwd + " is the password.")           # SSSSSSSSS is the password.
    ...: print("The password is {}.".format(pwd))   # The password is SSSSSSSSS.
    ...: print(["The", "password", "is", pwd])      # ['The', 'password', 'is', 'RRRRRRRRR']
    ...: print(pwd[:])
The password is SSSSSSSSS
SSSSSSSSS is the password.
The password is SSSSSSSSS.
['The', 'password', 'is', 'RRRRRRRRR']
XXXXXXXXX
In [21]: print(" ".join(["The", "password", "is", pwd]))
The password is XXXXXXXXX
However, I fail to really see how this is useful. I mean: the purpose of this class is to avoid programming errors that end up displaying sensitive information? But then having an exception triggered is better, so that you can identify the bugs! For that, it's probably best to raise NotImplementedError inside __str__ and __repr__ instead of silently providing a useless value... sure, you don't leak the secret, but finding bugs becomes really hard.
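A minimal sketch of that fail-loudly idea, under the assumption that the real value is only ever read through an explicit accessor. LoudSecret and reveal are hypothetical names.

```python
class LoudSecret:
    """Hypothetical wrapper that refuses to be rendered implicitly."""

    def __init__(self, value):
        self._value = value

    def __str__(self):
        raise NotImplementedError("refusing to render a secret; call reveal()")

    # repr() fails the same way as str().
    __repr__ = __str__

    def reveal(self):
        return self._value


pwd = LoudSecret("nothidden")

try:
    message = "The password is {}.".format(pwd)
except NotImplementedError:
    # The accidental formatting is reported instead of printing a mask.
    message = None

print(message)  # None
```

One caveat: a __repr__ that raises can make debuggers and interactive sessions awkward, since they call repr() on objects to display them, so this may suit production code paths better than exploratory work.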