I'm trying to find a way to lazily load a module-level variable.
Specifically, I've written a tiny Python library to talk to iTunes, and I want to have a DOWNLOAD_FOLDER_PATH
module variable. Unfortunately, iTunes won't tell you where its download folder is, so I've written a function that grabs the filepath of a few podcast tracks and climbs back up the directory tree until it finds the "Downloads" directory.
This takes a second or two, so I'd like to have it evaluated lazily, rather than at module import time.
Is there any way to lazily assign a module variable when it's first accessed or will I have to rely on a function?
Each Module/file has its own global variable When you create a variable (not within a function) in a Python file, you mount this variable to the namespace of the current module. Every command in this Python file can access, read, and modify the value of the variable, that is, it becomes a global variable.
Lazy import is a very useful feature of the Pyforest library as this feature automatically imports the library for us, if we don't use the library it won't be added. This feature is very useful to those who don't want to write the import statements again and again in their code.
If you want to check how lazy loading works and how lazy loading routing flow, then Augury is the best tool we have. Click on ctrl+F12 to enable the debugger and click on the Augury tab. Click on the router tree. Here, it will show the route flow of our modules.
You can't do it with modules, but you can disguise a class "as if" it was a module, e.g., in itun.py
, code...:
import sys
class _Sneaky(object):
def __init__(self):
self.download = None
@property
def DOWNLOAD_PATH(self):
if not self.download:
self.download = heavyComputations()
return self.download
def __getattr__(self, name):
return globals()[name]
# other parts of itun that you WANT to code in
# module-ish ways
sys.modules[__name__] = _Sneaky()
Now anybody can import itun
... and get in fact your itun._Sneaky()
instance. The __getattr__
is there to let you access anything else in itun.py
that may be more convenient for you to code as a top-level module object, than inside _Sneaky
!_)
It turns out that as of Python 3.7, it's possible to do this cleanly by defining a __getattr__()
at the module level, as specified in PEP 562 and documented in the data model chapter in the Python reference documentation.
# mymodule.py
from typing import Any
DOWNLOAD_FOLDER_PATH: str
def _download_folder_path() -> str:
global DOWNLOAD_FOLDER_PATH
DOWNLOAD_FOLDER_PATH = ... # compute however ...
return DOWNLOAD_FOLDER_PATH
def __getattr__(name: str) -> Any:
if name == "DOWNLOAD_FOLDER_PATH":
return _download_folder_path()
raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
I used Alex' implementation on Python 3.3, but this crashes miserably: The code
def __getattr__(self, name):
return globals()[name]
is not correct because an AttributeError
should be raised, not a KeyError
.
This crashed immediately under Python 3.3, because a lot of introspection is done
during the import, looking for attributes like __path__
, __loader__
etc.
Here is the version that we use now in our project to allow for lazy imports
in a module. The __init__
of the module is delayed until the first attribute access
that has not a special name:
""" config.py """
# lazy initialization of this module to avoid circular import.
# the trick is to replace this module by an instance!
# modelled after a post from Alex Martelli :-)
Lazy module variables--can it be done?
class _Sneaky(object):
def __init__(self, name):
self.module = sys.modules[name]
sys.modules[name] = self
self.initializing = True
def __getattr__(self, name):
# call module.__init__ after import introspection is done
if self.initializing and not name[:2] == '__' == name[-2:]:
self.initializing = False
__init__(self.module)
return getattr(self.module, name)
_Sneaky(__name__)
The module now needs to define an init function. This function can be used to import modules that might import ourselves:
def __init__(module):
...
# do something that imports config.py again
...
The code can be put into another module, and it can be extended with properties as in the examples above.
Maybe that is useful for somebody.
The proper way of doing this, according to the Python docs, is to subclass types.ModuleType
and then dynamically update the module's __class__
. So, here's a solution loosely on Christian Tismer's answer but probably not resembling it much at all:
import sys
import types
class _Sneaky(types.ModuleType):
@property
def DOWNLOAD_FOLDER_PATH(self):
if not hasattr(self, '_download_folder_path'):
self._download_folder_path = '/dev/block/'
return self._download_folder_path
sys.modules[__name__].__class__ = _Sneaky
Since Python 3.7 (and as a result of PEP-562), this is now possible with the module-level __getattr__
:
Inside your module, put something like:
def _long_function():
# print() function to show this is called only once
print("Determining DOWNLOAD_FOLDER_PATH...")
# Determine the module-level variable
path = "/some/path/here"
# Set the global (module scope)
globals()['DOWNLOAD_FOLDER_PATH'] = path
# ... and return it
return path
def __getattr__(name):
if name == "DOWNLOAD_FOLDER_PATH":
return _long_function()
# Implicit else
raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
From this it should be clear that the _long_function()
isn't executed when you import your module, e.g.:
print("-- before import --")
import somemodule
print("-- after import --")
results in just:
-- before import -- -- after import --
But when you attempt to access the name from the module, the module-level __getattr__
will be called, which in turn will call _long_function
, which will perform the long-running task, cache it as a module-level variable, and return the result back to the code that called it.
For example, with the first block above inside the module "somemodule.py", the following code:
import somemodule
print("--")
print(somemodule.DOWNLOAD_FOLDER_PATH)
print('--')
print(somemodule.DOWNLOAD_FOLDER_PATH)
print('--')
produces:
-- Determining DOWNLOAD_FOLDER_PATH... /some/path/here -- /some/path/here --
or, more clearly:
# LINE OF CODE # OUTPUT
import somemodule # (nothing)
print("--") # --
print(somemodule.DOWNLOAD_FOLDER_PATH) # Determining DOWNLOAD_FOLDER_PATH...
# /some/path/here
print("--") # --
print(somemodule.DOWNLOAD_FOLDER_PATH) # /some/path/here
print("--") # --
Lastly, you can also implement __dir__
as the PEP describes if you want to indicate (e.g. to code introspection tools) that DOWNLOAD_FOLDER_PATH
is available.
Is there any way to lazily assign a module variable when it's first accessed or will I have to rely on a function?
I think you are correct in saying that a function is the best solution to your problem here. I will give you a brief example to illustrate.
#myfile.py - an example module with some expensive module level code.
import os
# expensive operation to crawl up in directory structure
The expensive operation will be executed on import if it is at module level. There is not a way to stop this, short of lazily importing the entire module!!
#myfile2.py - a module with expensive code placed inside a function.
import os
def getdownloadsfolder(curdir=None):
"""a function that will search upward from the user's current directory
to find the 'Downloads' folder."""
# expensive operation now here.
You will be following best practice by using this method.
Recently I came across the same problem, and have found a way to do it.
class LazyObject(object):
def __init__(self):
self.initialized = False
setattr(self, 'data', None)
def init(self, *args):
#print 'initializing'
pass
def __len__(self): return len(self.data)
def __repr__(self): return repr(self.data)
def __getattribute__(self, key):
if object.__getattribute__(self, 'initialized') == False:
object.__getattribute__(self, 'init')(self)
setattr(self, 'initialized', True)
if key == 'data':
return object.__getattribute__(self, 'data')
else:
try:
return object.__getattribute__(self, 'data').__getattribute__(key)
except AttributeError:
return super(LazyObject, self).__getattribute__(key)
With this LazyObject
, You can define a init
method for the object, and the object will be initialized lazily, example code looks like:
o = LazyObject()
def slow_init(self):
time.sleep(1) # simulate slow initialization
self.data = 'done'
o.init = slow_init
the o
object above will have exactly the same methods whatever 'done'
object have, for example, you can do:
# o will be initialized, then apply the `len` method
assert len(o) == 4
complete code with tests (works in 2.7) can be found here:
https://gist.github.com/observerss/007fedc5b74c74f3ea08
If that variable lived in a class rather than a module, then you could overload getattr, or better yet, populate it in init.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With