Where to instantiate a boto3 S3 client so it is reused during a request?

I'm wondering where the best place is to instantiate a boto3 S3 client so that it can be reused for the duration of a request in Django.

I have a django model with a computed property that returns a signed s3 url:

@property
def url(self):
    client = boto3.client('s3')
    params = {
        'Bucket': settings.BUCKET,
        'Key': self.frame.s3_key,
        'VersionId': self.key
    }
    return client.generate_presigned_url('get_object', Params=params)

The object is serialized as JSON and returned in a list that can contain hundreds of these objects.

Even though boto3.client('s3') does not perform any network requests when instantiated, I've found that it is slow.

Placing S3_CLIENT = boto3.client('s3') in settings.py and using that instead of instantiating a new client per object reduced the response time by ~3x with 100 results. However, I know it is bad practice to place global variables in settings.py.

My question is where to instantiate this client so that it can be reused at least at the request level.

Fingel asked Jan 21 '16 17:01

1 Answer

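If your code runs in AWS Lambda, go with a global client. Lambda reuses execution environments across invocations, so a client created at module level is built once per environment rather than once per invocation, which saves both cost and time.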

Take advantage of execution environment reuse to improve the performance of your function. Initialize SDK clients and database connections outside of the function handler

https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html

Otherwise, I think this is a stylistic choice that depends on your app.

If your client will never change, a global is a safe choice. The drawback is that a global is effectively a constant, so you shouldn't rebind it at runtime, which makes it hard to switch to a different Session later. You could use a singleton instead, but the code would become more verbose.
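One lighter-weight alternative to a hand-written singleton is to memoize a small factory function. This is a minimal sketch rather than part of the original answer: get_client is a hypothetical helper name, and boto3 is imported inside the function on the assumption that you want this module to import cleanly before boto3 is actually needed.

```python
import functools


@functools.lru_cache(maxsize=None)
def get_client(service_name):
    """Return one shared boto3 client per service name (hypothetical helper)."""
    # Imported here so this module can be imported before boto3 is needed.
    import boto3

    # Runs only on the first call for each service_name; later calls
    # return the cached client object.
    return boto3.client(service_name)
```

In the model property from the question, client = boto3.client('s3') would then become client = get_client('s3'). Calling get_client.cache_clear() drops the cached clients, so the next call builds a fresh one, which may be handy in tests.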

If you instantiate clients everywhere, any change to the client() call signature becomes a large effort. For example, if you need to pass client('s3', verify=True), you'd have to add verify=True at every call site, which is a pain. It's unlikely you'd need to, though. The only parameter you're likely to override is config, and you can set that once on the session, e.g. with set_default_client_config on the underlying botocore session.

You could give the clients their own module, e.g.

foo.bar.aws.clients

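# Use these as clients.ecs_client after importing the module;
# `from ... import ecs_client` would copy the value (None) at import time.
session = None
ecs_client = None
eks_client = None

def init_session(new_session):
  # Rebind the module-level names; without `global` these would be locals.
  global session, ecs_client, eks_client
  session = new_session
  ecs_client = session.client('ecs')
  eks_client = session.client('eks')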

You can call init_session from an appropriate place, or provide defaults and use an import hook to instantiate automatically. This file will grow as you use more clients, but at least the ugliness is contained. You could also do a hack like

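def init_session(s):
  # Rebind the module-level session rather than creating a local.
  global session
  session = s
  clients = ['ecs', 'iam', 'eks']  # … plus whichever other services you use
  for c in clients:
    # Creates module-level names such as ecs_client, iam_client, eks_client.
    globals()[f'{c}_client'] = session.client(c)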

The problem is the indirection this hack adds: IntelliJ, for example, is not smart enough to figure out where your clients come from and will report that you are using an undefined variable.
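If the IDE warnings matter, one option that keeps every name explicit is a small container class with one lazily created attribute per client. This is only a sketch under assumptions: AwsClients and its attribute names are hypothetical, session is assumed to be a boto3 Session (anything with a client() method works), and functools.cached_property requires Python 3.8 or later.

```python
import functools


class AwsClients:
    """Hypothetical container: one explicitly named attribute per client."""

    def __init__(self, session):
        # Expected to be a boto3.Session, or any object with .client(name).
        self._session = session

    @functools.cached_property
    def ecs(self):
        # Created on first access, then stored on the instance.
        return self._session.client('ecs')

    @functools.cached_property
    def eks(self):
        # Same pattern; add one property per service you use.
        return self._session.client('eks')
```

Because each attribute is declared in the class body, an IDE can resolve clients.ecs without guessing, and swapping the session means constructing a new AwsClients(new_session). The cost is one small property per service.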

wonton answered Oct 18 '22 20:10