Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lazy module variables--can it be done?

I'm trying to find a way to lazily load a module-level variable.

Specifically, I've written a tiny Python library to talk to iTunes, and I want to have a DOWNLOAD_FOLDER_PATH module variable. Unfortunately, iTunes won't tell you where its download folder is, so I've written a function that grabs the filepath of a few podcast tracks and climbs back up the directory tree until it finds the "Downloads" directory.

This takes a second or two, so I'd like to have it evaluated lazily, rather than at module import time.

Is there any way to lazily assign a module variable when it's first accessed or will I have to rely on a function?

like image 714
wbg Avatar asked Sep 22 '09 22:09

wbg


People also ask

Can Python modules have variables?

Each Module/file has its own global variable When you create a variable (not within a function) in a Python file, you mount this variable to the namespace of the current module. Every command in this Python file can access, read, and modify the value of the variable, that is, it becomes a global variable.

What is lazy import python?

Lazy import is a very useful feature of the Pyforest library as this feature automatically imports the library for us, if we don't use the library it won't be added. This feature is very useful to those who don't want to write the import statements again and again in their code.

How do I know if angular lazy loading is working?

If you want to check how lazy loading works and how lazy loading routing flow, then Augury is the best tool we have. Click on ctrl+F12 to enable the debugger and click on the Augury tab. Click on the router tree. Here, it will show the route flow of our modules.


8 Answers

You can't do it with modules, but you can disguise a class "as if" it was a module, e.g., in itun.py, code...:

import sys

class _Sneaky(object):
  def __init__(self):
    self.download = None

  @property
  def DOWNLOAD_PATH(self):
    if not self.download:
      self.download = heavyComputations()
    return self.download

  def __getattr__(self, name):
    return globals()[name]

# other parts of itun that you WANT to code in
# module-ish ways

sys.modules[__name__] = _Sneaky()

Now anybody can import itun... and get in fact your itun._Sneaky() instance. The __getattr__ is there to let you access anything else in itun.py that may be more convenient for you to code as a top-level module object, than inside _Sneaky!_)

like image 190
Alex Martelli Avatar answered Oct 02 '22 09:10

Alex Martelli


It turns out that as of Python 3.7, it's possible to do this cleanly by defining a __getattr__() at the module level, as specified in PEP 562 and documented in the data model chapter in the Python reference documentation.

# mymodule.py

from typing import Any

DOWNLOAD_FOLDER_PATH: str

def _download_folder_path() -> str:
    global DOWNLOAD_FOLDER_PATH
    DOWNLOAD_FOLDER_PATH = ... # compute however ...
    return DOWNLOAD_FOLDER_PATH

def __getattr__(name: str) -> Any:
    if name == "DOWNLOAD_FOLDER_PATH":
        return _download_folder_path()
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
like image 22
Dan Walters Avatar answered Sep 30 '22 09:09

Dan Walters


I used Alex' implementation on Python 3.3, but this crashes miserably: The code

  def __getattr__(self, name):
    return globals()[name]

is not correct because an AttributeError should be raised, not a KeyError. This crashed immediately under Python 3.3, because a lot of introspection is done during the import, looking for attributes like __path__, __loader__ etc.

Here is the version that we use now in our project to allow for lazy imports in a module. The __init__ of the module is delayed until the first attribute access that has not a special name:

""" config.py """
# lazy initialization of this module to avoid circular import.
# the trick is to replace this module by an instance!
# modelled after a post from Alex Martelli :-)

Lazy module variables--can it be done?

class _Sneaky(object):
    def __init__(self, name):
        self.module = sys.modules[name]
        sys.modules[name] = self
        self.initializing = True

    def __getattr__(self, name):
        # call module.__init__ after import introspection is done
        if self.initializing and not name[:2] == '__' == name[-2:]:
            self.initializing = False
            __init__(self.module)
        return getattr(self.module, name)

_Sneaky(__name__)

The module now needs to define an init function. This function can be used to import modules that might import ourselves:

def __init__(module):
    ...
    # do something that imports config.py again
    ...

The code can be put into another module, and it can be extended with properties as in the examples above.

Maybe that is useful for somebody.

like image 34
Christian Tismer Avatar answered Oct 02 '22 09:10

Christian Tismer


The proper way of doing this, according to the Python docs, is to subclass types.ModuleType and then dynamically update the module's __class__. So, here's a solution loosely on Christian Tismer's answer but probably not resembling it much at all:

import sys
import types

class _Sneaky(types.ModuleType):
    @property
    def DOWNLOAD_FOLDER_PATH(self):
        if not hasattr(self, '_download_folder_path'):
            self._download_folder_path = '/dev/block/'
        return self._download_folder_path
sys.modules[__name__].__class__ = _Sneaky
like image 28
wizzwizz4 Avatar answered Oct 04 '22 09:10

wizzwizz4


Since Python 3.7 (and as a result of PEP-562), this is now possible with the module-level __getattr__:

Inside your module, put something like:

def _long_function():
    # print() function to show this is called only once
    print("Determining DOWNLOAD_FOLDER_PATH...")
    # Determine the module-level variable
    path = "/some/path/here"
    # Set the global (module scope)
    globals()['DOWNLOAD_FOLDER_PATH'] = path
    # ... and return it
    return path


def __getattr__(name):
    if name == "DOWNLOAD_FOLDER_PATH":
        return _long_function()

    # Implicit else
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

From this it should be clear that the _long_function() isn't executed when you import your module, e.g.:

print("-- before import --")
import somemodule
print("-- after import --")

results in just:

-- before import --
-- after import --

But when you attempt to access the name from the module, the module-level __getattr__ will be called, which in turn will call _long_function, which will perform the long-running task, cache it as a module-level variable, and return the result back to the code that called it.

For example, with the first block above inside the module "somemodule.py", the following code:

import somemodule
print("--")
print(somemodule.DOWNLOAD_FOLDER_PATH)
print('--')
print(somemodule.DOWNLOAD_FOLDER_PATH)
print('--')

produces:

--
Determining DOWNLOAD_FOLDER_PATH...
/some/path/here
--
/some/path/here
--

or, more clearly:

# LINE OF CODE                                # OUTPUT
import somemodule                             # (nothing)

print("--")                                   # --

print(somemodule.DOWNLOAD_FOLDER_PATH)        # Determining DOWNLOAD_FOLDER_PATH...
                                              # /some/path/here

print("--")                                   # --

print(somemodule.DOWNLOAD_FOLDER_PATH)        # /some/path/here

print("--")                                   # --

Lastly, you can also implement __dir__ as the PEP describes if you want to indicate (e.g. to code introspection tools) that DOWNLOAD_FOLDER_PATH is available.

like image 42
jedwards Avatar answered Oct 04 '22 09:10

jedwards


Is there any way to lazily assign a module variable when it's first accessed or will I have to rely on a function?

I think you are correct in saying that a function is the best solution to your problem here. I will give you a brief example to illustrate.

#myfile.py - an example module with some expensive module level code.

import os
# expensive operation to crawl up in directory structure

The expensive operation will be executed on import if it is at module level. There is not a way to stop this, short of lazily importing the entire module!!

#myfile2.py - a module with expensive code placed inside a function.

import os

def getdownloadsfolder(curdir=None):
    """a function that will search upward from the user's current directory
        to find the 'Downloads' folder."""
    # expensive operation now here.

You will be following best practice by using this method.

like image 28
Simon Edwards Avatar answered Oct 01 '22 09:10

Simon Edwards


Recently I came across the same problem, and have found a way to do it.

class LazyObject(object):
    def __init__(self):
        self.initialized = False
        setattr(self, 'data', None)

    def init(self, *args):
        #print 'initializing'
        pass

    def __len__(self): return len(self.data)
    def __repr__(self): return repr(self.data)

    def __getattribute__(self, key):
        if object.__getattribute__(self, 'initialized') == False:
            object.__getattribute__(self, 'init')(self)
            setattr(self, 'initialized', True)

        if key == 'data':
            return object.__getattribute__(self, 'data')
        else:
            try:
                return object.__getattribute__(self, 'data').__getattribute__(key)
            except AttributeError:
                return super(LazyObject, self).__getattribute__(key)

With this LazyObject, You can define a init method for the object, and the object will be initialized lazily, example code looks like:

o = LazyObject()
def slow_init(self):
    time.sleep(1) # simulate slow initialization
    self.data = 'done'
o.init = slow_init

the o object above will have exactly the same methods whatever 'done' object have, for example, you can do:

# o will be initialized, then apply the `len` method 
assert len(o) == 4

complete code with tests (works in 2.7) can be found here:

https://gist.github.com/observerss/007fedc5b74c74f3ea08

like image 39
jingchao Avatar answered Oct 01 '22 09:10

jingchao


If that variable lived in a class rather than a module, then you could overload getattr, or better yet, populate it in init.

like image 41
Sean Cavanagh Avatar answered Oct 02 '22 09:10

Sean Cavanagh