I am using a Homebrew-installed Python on my Mac (running OS X 10.13.1), and of late, I’ve noticed that the interpreter takes a frustratingly long time to start up.
In setting out to try to solve this problem, I did a simple check with time
:
PIPER-ALPHA:~$ time bpython -c 'pass'
real 0m12.141s
user 0m1.662s
sys 0m10.073s
… which revealed the egregiousness of the issue: 12 seconds!
I then used gnomon
– an extremely handy npm
module one uses for itemizing the timing of CLI tools – to winnow the issue down to the offending Python module. I used this command:
PIPER-ALPHA:~$ PYTHONVERBOSE=1 bpython -c 'pass' 2>&1 | tee -a /tmp/bpython-startup-messages | gnomon
… the gnomon
output shows the time taken by each line issued by the verbose Python interpreter output. It looks like this:
… I have highlighted the line of output that took nearly twelve seconds to execute – by far the longest, as every single other line typically took a few nanoseconds, or at most a few μseconds, maybe.
Normally, if I encounter a wonky Python extension, I’ll recompile it myself, or otherwise tweak it from source, to properly render it unproblematic as necessary. But in this case, I am dealing with a c-extension module that is part of a larger Python standard-library module, all of which shipped with the Homebrew binary package (which is called a “bottle” in the Homebrew argot) that contained this version of Python.
Is this an issue anyone else can attest to? In particular, is it an issue anyone else sees when running Python under similar circumstances? And, most importantly, how might I fix it? Will I need to rebuild the entire Python installation, either using Homebrew or without it?
I have figured this out – the answer turns out to be simultaneously illuminating and embarrassing – and my solution may help others out when faced with similar circumstances.
In a nutshell: the aggravatingly long ~12s pauses I experienced while the Python interpreter loaded were being caused by having an excessively large amount of Python extension modules installed. It was not an issue with Python 2.7’s bundled xml.parsers.expat
module, nor with its C-API pyexpat
extension.
To wit: my use of the gnomon
tool, which furnished what appeared to be direct and straightforward evidence pointing to these modules, ended up misleading me in my conclusions as to where the problematic code was to be found.
After posting my question, I did a bit of additional forensic poking-around. By varying the Python code I was passing to the interpreter while invoking the command-line speed checks, I found that the gnomon
report would show the same twelve-plus-second halt, but at the incidence of different import
statements. Further, I found that some command variants (e.g. those executed using the pythonpy
CLT) weren’t plagued by the halting behavior at all.
I was able to pinpoint the lines of code responsible for the issues’ manifestation when I stumbled upon it by accident – while running my tests, the interminably long halts were no less annoying, and I ended up control-C’ing a number of tests in mid-halt. Those aborted test runs terminated with KeyboardInterrupt
exceptions, and the accompanying stacktrace output revealed the function in which things were dragging:
… the pkg_resources
module, when imported, walks each of the extensions directories named in sys.path
, enumerating each package in each extension, and subsequently reading in and then parsing all the associated metadata for all of those. Using any part of pkg_resources
(which itself is part of the essential setuptools
module) triggers this time-consuming action (which is then cached, at least, for the duration of that particular interpreter invocation’s lifetime). Depending on how your Python install is set up, and how you invoke your interpreter, you may or may not end up doing something to trigger the use of pkg_resources
, but it is in pretty wide use across Python extension packages, so chances are good that it’ll get triggered by something.
The actual function responsible for the actual loop that actually enumerates the packages is _initialize_master_working_set()
– it’s the one I’ve highlighted in the screenshot above. This is what all my KeyboardInterrupt
stacktraces revealed. From there, it was immediately evident that the frustrating halts were a steeply linear function of the number of Cheese Shop packages present (something I had been reckless with after upgrading my laptop).
I immediately proceeded to pip-uninstall roughly 50% of the extensions I had gratuitously installed, and then pared down another 40% or so by hoisting most of my actively-developed Python stuff into isolated virtualenv
project directories.
I felt pretty dumb afterward, as I had managed to cleverly mislead myself with my use of fancy analysis tools, and then found the actual solution by accident – one to a problem resulting from my own careless inattention, no less. Regardless, it is still something that could bite other Pythonic developers out there, and thus worth the writeup. You are hereby invited to learn from my circuitous adventures in issue-triage and diagnosis, indeed!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With