
Creating a hook to a frequently accessed object

Tags:

python

I have an application which relies heavily on a Context instance that serves as the access point to the context in which a given calculation is performed.

If I want to provide access to the Context instance, I can:

  1. rely on a global variable
  2. pass the Context as a parameter to all the functions that require it

I would rather not use global variables, and passing the Context instance to all the functions is cumbersome and verbose.

How would you "hide, but make accessible" the calculation Context?

For example, imagine that Context simply computes the state (position and velocity) of planets according to different data.

class Context(object):
    def state(self, planet, epoch):
        """base class --- suppose `state` is meant
           to return a tuple of vectors."""
        raise NotImplementedError("provide an implementation!")

class DE405Context(Context):
    """Concrete context using DE405 planetary ephemeris"""
    def state(self, planet, epoch):
        """suppose that de405 reader exists and can provide
           the required (position, velocity) tuple."""
        return de405reader(planet, epoch)

def angular_momentum(planet, epoch, context):
    """suppose we care about the angular momentum of the planet,
       and that `cross` exists"""
    r, v = context.state(planet, epoch)
    return cross(r, v)

# a second alternative, a "Calculator" class that contains the context
class Calculator(object):

    def __init__(self, context):
        self._ctx = context

    def angular_momentum(self, planet, epoch):
        r, v = self._ctx.state(planet, epoch)
        return cross(r, v)

# use as follows:
my_context = DE405Context()
now = now() # assume this function returns an epoch
# first case:
print(angular_momentum("Saturn", now, my_context))
# second case:
calculator = Calculator(my_context)
print(calculator.angular_momentum("Saturn", now))

Of course, I could add all the operations directly into "Context", but it does not feel right.

In real life, the Context not only computes positions of planets! It computes many more things, and it serves as the access point to a lot of data.

So, to make my question more succinct: how do you deal with objects which need to be accessed by many classes?

I am currently exploring Python's context managers, but without much luck. I also thought about dynamically adding a "context" attribute to all functions directly (functions are objects, so they can hold a reference to arbitrary objects), i.e.:

def angular_momentum(planet, epoch):
    # the function looks up the context stored on itself
    r, v = angular_momentum.ctx.state(planet, epoch)
    return cross(r, v)

# somewhere before calling anything...
angular_momentum.ctx = my_context

edit

It would be great to create a "calculation context" with a with statement, for example:

 with my_context:
  h = angular_momentum("Earth", now)

Of course, I can already do that if I simply write:

 with my_context as ctx:
  h = angular_momentum("Earth", now, ctx) # first implementation above

Maybe a variation of this with the Strategy pattern?

Escualo asked Jan 30 '26 20:01


2 Answers

You generally don't want to "hide" anything in Python. You may want to signal human readers that they should treat it as "private", but this really just means "you should be able to understand my API even if you ignore this object", not "you can't access this".

The idiomatic way to do that in Python is to prefix it with an underscore—and, if your module might ever be used with from foo import *, add an explicit __all__ global that lists all the public exports. Again, neither of these will actually prevent anyone from seeing your variable, or even accessing it from outside after import foo.

See PEP 8 on Global Variable Names for more details.

Some style guides suggest special prefixes, all-caps-names, or other special distinguishing marks for globals, but PEP 8 specifically says that the conventions are the same, except for the __all__ and/or leading underscore.

Meanwhile, the behavior you want is clearly that of a global variable—a single object that everyone implicitly shares and references. Trying to disguise it as anything other than what it is will do you no good, except possibly for passing a lint check or a code review that you shouldn't have passed. All of the problems with global variables come from being a single object that everyone implicitly shares and references, not from being directly in the globals() dictionary or anything like that, so any decent fake global is just as bad as a real global. If that truly is the behavior you want, make it a global variable.

Putting it together:

# do not include _context here
__all__ = ['Context', 'DE405Context', 'Calculator', …

_context = Context()

Also, of course, you may want to call it something like _global_context or even _private_global_context, instead of just _context.

But keep in mind that globals are still members of a module, not of the entire universe, so even a public context will still be scoped as foo.context when client code does an import foo. And this may be exactly what you want. If you want a way for client scripts to import your module and then control its behavior, maybe foo.context = foo.Context(…) is exactly the right way. Of course this won't work in multithreaded (or gevent/coroutine/etc.) code, and it's inappropriate in various other cases, but where those concerns don't apply, this is fine.

Since you brought up multithreading in your comments: In the simple style of multithreading where you have long-running jobs, the global style actually works perfectly fine, with a trivial change—replace the global Context with a global threading.local instance that contains a Context. Even in the style where you have small jobs handled by a thread pool, it's not much more complicated. You attach a context to each job, and then when a worker pulls a job off the queue, it sets the thread-local context to that job's context.

However, I'm not sure multithreading is going to be a good fit for your app anyway. Multithreading is great in Python when your tasks occasionally have to block for IO and you want to be able to do that without stopping other tasks—but, thanks to the GIL, it's nearly useless for parallelizing CPU work, and it sounds like that's what you're looking for. Multiprocessing (whether via the multiprocessing module or otherwise) may be more of what you're after. And with separate processes, keeping separate contexts is even simpler. (Or, you can write thread-based code and switch it to multiprocessing, leaving the threading.local variables as-is and only changing the way you spawn new tasks, and everything still works just fine.)

It may make sense to provide a "context" in the context manager sense, as an external version of the standard library's decimal module did, so someone can write:

with foo.Context(…):
    # do stuff under custom context
# back to default context

However, nobody could really think of a good use case for that (especially since, at least in the naive implementation, it doesn't actually solve the threading/etc. problem), so it wasn't added to the standard library, and you may not need it either.

If you want to do this, it's pretty trivial. If you're using a private global, just add this to your Context class:

def __enter__(self):
    global _context
    # remember whichever context was active so it can be restored
    self._stashedcontext = _context
    _context = self

def __exit__(self, *args):
    global _context
    _context = self._stashedcontext

And it should be obvious how to adjust this to public, thread-local, etc. alternatives.
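For completeness, here is a small self-contained sketch of the private-global version in use. `FixedContext`, its constant vectors, and the hand-written `cross` are assumptions invented for the demo, standing in for a real ephemeris reader and vector library.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

class FixedContext:
    """Hypothetical context that returns the same state for every planet."""
    def __init__(self, position, velocity):
        self._state = (position, velocity)

    def state(self, planet, epoch):
        return self._state

    def __enter__(self):
        global _context
        # Remember the previously active context, then install this one.
        self._stashedcontext = _context
        _context = self
        return self

    def __exit__(self, *exc_info):
        global _context
        _context = self._stashedcontext
        return False  # never swallow exceptions

# The private module-level default.
_context = FixedContext((1, 0, 0), (0, 1, 0))

def angular_momentum(planet, epoch):
    r, v = _context.state(planet, epoch)
    return cross(r, v)

before = angular_momentum("Earth", 0)
with FixedContext((2, 0, 0), (0, 3, 0)):
    inside = angular_momentum("Earth", 0)
after = angular_momentum("Earth", 0)
```

Because the stash is a single attribute, entering the same instance twice (nested) would lose the outer context; a list used as a stack would be one way around that.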

Another alternative is to make everything a member of the Context object. The top-level module functions then just delegate to the global context, which has a reasonable default value. This is exactly how the standard library random module works—you can create a random.Random() and call randrange on it, or you can just call random.randrange(), which calls the same thing on a global default random.Random() object.
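A quick way to see the two styles side by side, using only the standard library (the seed value 42 is arbitrary):

```python
import random

# Explicit style: a private generator object with its own state.
rng = random.Random(42)
a = rng.randrange(100)

# Implicit style: the module-level functions act on a hidden shared instance.
random.seed(42)
b = random.randrange(100)

print(a == b)
```

Both calls run the same method on a `Random` object seeded the same way, so they produce the same value; the only difference is whether the object is named explicitly.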

If creating a Context is too heavy to do at import time, especially if it might not get used (because nobody might ever call the global functions), you can use the singleton pattern to create it on first access. But that's rarely necessary. And when it's not, the code is trivial. For example, the source to random, starting at line 881, does this:

_inst = Random()
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
…

And that's all there is to it.

And finally, as you suggested, you could make everything a member of a different Calculator object which owns a Context object. This is the traditional OOP solution; overusing it tends to make Python feel like Java, but using it when it's appropriate is not a bad thing.

abarnert answered Feb 01 '26 12:02


You might consider using a proxy object. Here's a library that helps in creating object proxies:

http://pypi.python.org/pypi/ProxyTypes

Flask uses object proxies for its "current_app", "request" and other variables. All it takes to reference them is:

from flask import request

You could create a proxy object that is a reference to your real context, and use thread locals to manage the instances (if that would work for you).
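A hand-rolled sketch of that idea follows. It is not how ProxyTypes or Flask implement their proxies; `ContextProxy`, `DemoContext`, and `_current_context` are hypothetical names used only for illustration.

```python
import threading

_local = threading.local()

def _current_context():
    """Return the context active in this thread, or fail loudly."""
    try:
        return _local.context
    except AttributeError:
        raise RuntimeError("no context is active in this thread")

class ContextProxy:
    """Forwards attribute access to whatever `lookup()` returns right now."""
    def __init__(self, lookup):
        self._lookup = lookup

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        return getattr(self._lookup(), name)

# Importable module-level name; client code never touches _local directly.
context = ContextProxy(_current_context)

class DemoContext:
    """Hypothetical context returning a dummy state."""
    def __init__(self, source):
        self.source = source

    def state(self, planet, epoch):
        return (planet, epoch, self.source)

def describe(planet, epoch):
    # No context parameter: the proxy resolves the real object on each call.
    return context.state(planet, epoch)

_local.context = DemoContext("DE405")
first = describe("Saturn", 0)
_local.context = DemoContext("DE421")
second = describe("Saturn", 0)
```

The proxy is resolved at each attribute access, so swapping the thread-local context changes what every caller sees without anyone re-importing anything.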

Brian Dilley answered Feb 01 '26 13:02


