
Control python imports to reduce size and overhead

I have created a number of personal libraries to help with my daily coding. Best practice is to put imports at the beginning of a Python program. But say I import my library, or even just one function or class from it: all of the library's module-level imports are executed, even if those modules are only used by classes or functions I never call. I assume this increases the program's overhead?

One example. I have a library called pytools which looks something like this

import difflib

def foo(a, b):
    # uses difflib.SequenceMatcher
    return difflib.SequenceMatcher(None, a, b).ratio()

def bar():
    # benign function, e.g.
    print("Hello!")
    return True

class foobar:
    def __init__(self):
        print("New foobar")

    def ret_true(self):
        return True

The function foo uses difflib. Now say I am writing a new program that needs to use bar and foobar. I could either write

import pytools
...
item = pytools.foobar()
vals = pytools.bar()

or I could do

from pytools import foobar, bar
...
item = foobar()
vals = bar()

Does either choice reduce overhead or prevent the import of foo and its dependency on difflib? What if the import of difflib were inside the foo function?
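As a quick check (a minimal sketch; pytools_demo is a hypothetical stand-in module written to a temp directory purely for illustration), you can watch sys.modules to see that `from module import name` still executes the whole module, so the module-level `import difflib` runs either way:

```python
import os
import sys
import tempfile

# Write a tiny stand-in for the pytools library to a temp directory.
source = (
    "import difflib\n"
    "def foo(a, b):\n"
    "    return difflib.SequenceMatcher(None, a, b).ratio()\n"
    "def bar():\n"
    "    return True\n"
)
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "pytools_demo.py"), "w") as f:
    f.write(source)

sys.path.insert(0, tmpdir)
sys.modules.pop("difflib", None)   # make sure difflib is not already loaded

from pytools_demo import bar       # we only ask for bar...
print("difflib loaded:", "difflib" in sys.modules)  # ...but difflib was imported anyway
```

Either import form runs the entire module body once, so the heavy dependency is paid for regardless.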

The problem I am running into is when converting simple programs that use only one or two classes or functions from my libraries into executables: the executable ends up being 50 MB or so.

I have read through py2exe's Optimizing Size page (http://www.py2exe.org/index.cgi/OptimizingSize) and can reduce the size using some of its suggestions.

I guess I am really asking for best practice here. Is there some way to prevent the import of libraries whose dependencies live only in unused functions or classes? I've watched import statements execute in a debugger, and it appears that Python only "picks up" the line with "def somefunction" before moving on. Is the rest of the function body not executed until the function/class is used? If so, putting heavyweight imports at the beginning of a function or class could reduce overhead for the rest of the library.
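To answer that last part concretely: a function body is compiled when the module loads, but not executed until the function is called, so an import placed inside a function is deferred until the first call. A minimal sketch (assuming nothing else in the process has already loaded difflib):

```python
import sys

def foo(a, b):
    import difflib  # deferred: only executed when foo() is first called
    return difflib.SequenceMatcher(None, a, b).ratio()

sys.modules.pop("difflib", None)          # ensure difflib is not loaded yet
loaded_before = "difflib" in sys.modules  # False: defining foo imported nothing
ratio = foo("kitten", "sitting")          # first call triggers the import
loaded_after = "difflib" in sys.modules   # True: the deferred import ran
print(loaded_before, loaded_after)
```

Note that after the first call, difflib stays cached in sys.modules, so subsequent calls pay only a cheap dictionary lookup.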

asked Jun 22 '12 by Paul Seeb


1 Answer

The only way to effectively reduce your dependencies is to split your tool box into smaller modules, and to only import the modules you need.
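For example (a sketch; the names pytools_pkg, diffing, and misc are illustrative, and the package is built in a temp directory here only to make the demonstration self-contained), a split like this keeps difflib out of programs that need only the lightweight helpers:

```python
import os
import sys
import tempfile

# Build a small package:
#   pytools_pkg/__init__.py  (empty, so importing the package loads nothing heavy)
#   pytools_pkg/diffing.py   (holds foo and the `import difflib`)
#   pytools_pkg/misc.py      (holds bar; no heavy imports)
pkg = os.path.join(tempfile.mkdtemp(), "pytools_pkg")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "diffing.py"), "w") as f:
    f.write("import difflib\n"
            "def foo(a, b):\n"
            "    return difflib.SequenceMatcher(None, a, b).ratio()\n")
with open(os.path.join(pkg, "misc.py"), "w") as f:
    f.write("def bar():\n    return True\n")

sys.path.insert(0, os.path.dirname(pkg))
sys.modules.pop("difflib", None)   # ensure difflib is not loaded

from pytools_pkg.misc import bar   # import only the light submodule
print("difflib loaded:", "difflib" in sys.modules)  # difflib was never touched
```

Because only pytools_pkg.misc is imported, diffing.py never executes and difflib is never loaded; a bundler that traces imports from your entry script will likewise see only the light submodule.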

Putting imports at the beginning of otherwise-unused functions will prevent loading those modules at run time, but this is discouraged because it hides the dependencies. Moreover, your Python-to-executable converter will likely need to include these modules anyway, since Python's dynamic nature makes it impossible to statically determine which functions are actually called.

answered Sep 28 '22 by Sven Marnach