Do Python's stylistic best practices apply to scientific coding?
I am finding it difficult to keep scientific Python code readable.
For example, it is suggested to use meaningful names for variables and to keep the namespace ordered by avoiding import *
. Thus, e.g. :
import numpy as np
normbar = np.random.normal(mean, std, np.shape(foo))
But these suggestions can lead to some difficult-to-read code, especially given the 79-character line width. For example, I just wrote the following operation:
net["weights"][ix1][ix2] += lrate * (CD / nCases - opts["weightcost_pretrain"].dot(net["weights"][ix1][ix2]))
I can span the expression across lines:
net["weights"][ix1][ix2] += lrate * (CD / nCases -
opts["weightcost_pretrain"].dot(net["weights"][ix1][ix2]))
but this does not seem much better, and I am not sure how deep to indent the second line. These kinds of line continuations become even trickier when one is double-indented into a nested loop, and there are only 50 characters available on a line.
Should I accept that scientific Python looks clunky, or are there ways to avoid lines like the example above?
Some potential approaches are:
I would appreciate any wisdom on which of these to pursue and which to avoid, as well as suggestions for other remedies.
- defining helper functions for combinations of arithmetic operations
- breaking operations into smaller pieces, and placing one on each line
These are both good ideas—in keeping with the intent behind PEP 8, and with Pythonic style in general. In fact, whenever someone suggests modifying PEP 8 to give more information about long lines, half the responses are usually "If you're going over the line limit, you're probably doing too much in one expression".
And, more generally, factoring out code and giving sensible names to sensible operations are always a good idea.
Of course without knowing exactly what all these things represent, I can only guess at how to split them up, but I think something like this would be pretty readable and meaningful:
cost = opts["weightcost_pretrain"].dot(net["weights"][ix1][ix2])
weight = lrate * (CD / nCases - cost)
net["weights"][ix1][ix2] += weight
I think the style guide always applies- I use Python daily for scientific work and find that I'm able to read my code more easily and come back to it months later with little effort if I've split up long lines into logical components and sensible variable names, or used a function.
I'd do something more like this:
weights = net["weights"][ix1][ix2]
opts_arr = opts["weightcost_pretrain"]
weights += lrate * (CD / nCases - opts_arr.dot(weights))
Another way of saying that Python is "concise" is that Python is syntactically dense, and I find it harder to read and understand a long line of Python than a long line of Java (especially when using high-level functions from 3rd party libraries that hide low-level logic, like NumPy).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With