
Why does separating my module into multiple files make it slower?

I made a Python module (swood) that, up until recently, was one large file with many classes. After refactoring related classes into separate files, everything still works, albeit around 50% slower. I assumed that, if anything, it would get a little faster because Python could more efficiently cache the bytecode for each file, improving the startup time.

I am running this code with CPython (haven't tested with PyPy and its ilk). I've run line_profiler on the old and refactored versions and the percentage of processing time spent on each line looks roughly the same before and after the refactor.

Here are some things about my program that might have something to do with it:

  • It creates many small classes like Note, and instantiating them might be expensive, though this wasn't a problem before the refactor.
  • When creating these classes, it gets them from a separate file imported at the beginning.
  • A lot of numpy-based array manipulation happens in the part that takes longest (scaling and mixing audio).
  • I have a cache that stores the scaled notes if they are used more than three times in 7.5 seconds. (code)

What is causing my code to get slower after doing nothing but separating it into multiple files?
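The note cache in the last bullet could look roughly like the sketch below. This is a hypothetical reconstruction of a "cache if used 3+ times within 7.5 seconds" policy, not the actual swood code; all names (`NoteCache`, `scale_func`) are made up for illustration:

```python
import time
from collections import defaultdict


class NoteCache:
    """Hypothetical sketch: cache a computed value once it has been
    requested `threshold` times within a `window`-second span."""

    def __init__(self, threshold=3, window=7.5):
        self.threshold = threshold
        self.window = window
        self.requests = defaultdict(list)  # key -> recent request timestamps
        self.cache = {}                    # key -> cached scaled note

    def get(self, key, scale_func):
        if key in self.cache:
            return self.cache[key]
        now = time.monotonic()
        # Record this request and drop requests older than the window.
        times = [t for t in self.requests[key] if now - t <= self.window]
        times.append(now)
        self.requests[key] = times
        result = scale_func(key)
        if len(times) >= self.threshold:
            self.cache[key] = result
        return result
```

A cache like this trades memory for repeated scaling work, which matters here because the scaling/mixing loop is the hot path.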

MilkeyMouse Avatar asked May 13 '16 06:05




1 Answer

After some more benchmarking, it turned out to be one of the things I suspected: accessing functions and classes from another module means an extra lookup for the Python interpreter, and that slight slowdown adds up in tight loops. The Python wiki covers this, too:

Avoiding dots...

Suppose you can't use map or a list comprehension? You may be stuck with the for loop. The for loop example has another inefficiency. Both newlist.append and word.upper are function references that are reevaluated each time through the loop. The original loop can be replaced with:

upper = str.upper          # bind the method to a local name once
newlist = []
append = newlist.append    # avoids re-evaluating newlist.append every iteration
for word in oldlist:
    append(upper(word))
MilkeyMouse Avatar answered Oct 19 '22 23:10