I come from C++, and I am struggling to get a sense of safety when programming in Python (for instance misspelling can create extremely hard to find bugs, but that is not the point here). Here I would like to understand how I can avoid doing horrible things by adhering to good practices.
The simple function below is perfectly fine in c++ but creates what I can only call a monstrosity in Python.
def fun(x):
x += 1
x = x + 1
return x
When I call it
var1 = 1;
print(fun(var1), var1)
var2 = np.array([1]);
print(fun(var2), var2)
I get
3 1
[3] [2]
Apart from the lack of homogeneous behaviour (which is already terrible), the second case is particularly hideous. The external variable is modified only by some of the instructions!
I know in details why it happens. So that is not my question. The point is that when constructing a complex program, I do not want to have to be extra careful with all these context-dependent and highly implicit technicalities.
There must be some good practice I can strictly adhere to that will prevent me from inadvertently producing the code above. I can think of ways, but they seem to overcomplicate the code, making C++ look like a more high level language.
What good practice should I follow to avoid that monstrosity?
Thanks!
[EDIT] Some clarification: What I struggle with is the fact that Python makes a type-dependent and context-dependent choice of creating a temporary. Again, I know the rules. However in C++ the choice is done by the programmer and clear throughout the whole function, while that is not the case in Python. Python requires the programmer to know quite some technicalities of the operations done on the argument in order to figure out if at that point Python is working on a temporary or on the original.
Notice that I constructed a function which both returns a value and has a side effect just to show my point.
The point is that a programmer might want to write that function to simply have side effects (no return statement), and midway through the function Python decides to build a temporary, so some side effects are not applied. On the other hand the programmer might not want side effects, and instead get some (and hard to predict ones).
In C++ the above is simply and clearly handled. In Python it is rather technical and requires knowing what triggers the generation of temporaries and what not. As I need to explain this to my students, I would like to give them a simple rule that will prevent them from falling into those traps.
Good practices to avoid such pitfalls:
list.sort
)sorted
)Your fun
does both, which goes against the conventions followed by most standard library code and popular third-party Python libraries. Breaking this "unwritten rule" is the cause of the particularly hideous result there.
Generally speaking, it's best if functions are kept "pure" when possible. It's easier to reason about a pure and stateless function, and they're easier to test.
A "sense of safety" when programming in Python comes from having a good test suite. As an interpreted and dynamic programming language, almost everything in Python happens at runtime. There is very little to protect you at compile time - pretty much only the syntax errors will be found. This is great for flexibility, e.g. virtually anything can be monkeypatched at runtime. With great power comes great responsibility. It is not unusual for a Python project to have twice as much test code as there is library code.
The one good practice that jumps to mind is command-query separation:
A function or method should only ever either compute and return something, or do something, at least when it comes to outside-observable behavior.
There's very few exceptions acceptable (think e.g. the pop
method of a Stack
data structure: It returns something, and does something) but those tend to be in places where it's so idiomatic, you wouldn't expect it any other way.
And when a function does something to its input values, that should be that function's sole purpose. That way, there's no nasty surprises.
Now for the inconsistent behavior between a "primitive" type and a more complex type, it's easiest to code defensively and assume that it's a reference anyway.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With