I'm looking for working code and ideas from others who have tried to build a multi-tenant Django application using database-level isolation.
Update/Solution: I ended solving this in a new opensource project: see django-db-multitenant
My goal is to multiplex requests as they come in to a single app server (WSGI frontend like gunicorn), based on the request hostname or request path (for instance, foo.example.com/
sets the Django connection to use database foo
, and bar.example.com/
uses database bar
).
I'm aware of a few existing solutions for multi tenancy in Django:
SET search_path
command to the db. Unfortunately, it is Postgres specific and I am stuck with MySQL.(id, tenant_id)
rather than (id)
. I've tried, and don't like, this approach for a number of reasons: it makes the application more complex, it can lead to hard-to-find bugs, and it provides no database-level isolation.My best idea so far is to do something like django-tenant-schemas
: in the first middleware, grab django.db.connection
and fiddle with the database selection rather than the schema. I haven't quite thought through what this means in terms of pooled/persistent connections
Another dead end I pursued was tenant-specific table prefixes: Setting aside that I'd need them to be dynamic, even a global table prefix is not easily achieved in Django (see rejected ticket 5000, among others).
Finally, Django multiple database support lets you define multiple named databases, and mux among them based on the instance type and read/write mode. Not helpful since there is no facility to select the db on a per-request basis.
Has anyone managed something similar? If so, how did you implement it?
Django's admin doesn't have any explicit support for multiple databases. If you want to provide an admin interface for a model on a database other than that specified by your router chain, you'll need to write custom ModelAdmin classes that will direct the admin to use a specific database for content.
A multi-tenant application (or multitenancy software) is an architecture in which a single instance serves multiples tenants. In other words, we host our application in a single place and all our customers (tenants) use the same resources.
What databases are supported by Django? PostgreSQL and MySQL, SQLite and Oracle. Apart from these, Django also supports databases such as ODBC, Microsoft SQL Server, IBM DB2, SAP SQL Anywhere, and Firebird using third-party packages. Note: Officially Django doesn't support any no-SQL databases.
Multitenant support for Django, using one tenant per database. Provides a simple multi-tenancy solution for Django based on the concept of having a single tenant per database. This application is still experimental, but is being used in production by the authors. Contributions and discussion are welcome.
The first step to using more than one database with Django is to tell Django about the database servers you’ll be using. This is done using the DATABASES setting. This setting maps database aliases, which are a way to refer to a specific database throughout Django, to a dictionary of settings for that specific connection.
For PostgreSQL, this application implements a semi-isolated approach , each tenant has its own schema and the connection details are shared via the public schema. django-db-multitenant makes it possible (even easy) to take a Django application designed for a single tenant and use it with multiple tenants.
The base router is used by Django’s database operations to allocate database usage. Whenever a query needs to know which database to use, it calls the base router, providing a model and a hint (if available). The base router tries each router class in turn until one returns a database suggestion.
I've done something similar that is closest to point 1, but instead of using middleware to set a default connection Django database routers are used. This allow application logic to use a number of databases if required for each request. It's up to the application logic to choose a suitable database for every query, and this is the big downside of this approach.
With this setup, all databases are listed in settings.DATABASES
, including databases which may be shared among customers. Each model that is customer specific is placed in a Django app that has a specific app label.
eg. The following class defines a model which exists in all customer databases.
class MyModel(Model): .... class Meta: app_label = 'customer_records' managed = False
A database router is placed in the settings.DATABASE_ROUTERS
chain to route database request by app_label
, something like this (not a full example):
class AppLabelRouter(object): def get_customer_db(self, model): # Route models belonging to 'myapp' to the 'shared_db' database, irrespective # of customer. if model._meta.app_label == 'myapp': return 'shared_db' if model._meta.app_label == 'customer_records': customer_db = thread_local_data.current_customer_db() if customer_db is not None: return customer_db raise Exception("No customer database selected") return None def db_for_read(self, model, **hints): return self.get_customer_db(model, **hints) def db_for_write(self, model, **hints): return self.get_customer_db(model, **hints)
The special part about this router is the thread_local_data.current_customer_db()
call. Before the router is exercised, the caller/application must have set up the current customer db in thread_local_data
. A Python context manager can be used for this purpose to push/pop a current customer database.
With all of this configured, the application code then looks something like this, where UseCustomerDatabase
is a context manager to push/pop a current customer database name into thread_local_data
so that thread_local_data.current_customer_db()
will return the correct database name when the router is eventually hit:
class MyView(DetailView): def get_object(self): db_name = determine_customer_db_to_use(self.request) with UseCustomerDatabase(db_name): return MyModel.object.get(pk=1)
This is quite a complex setup already. It works, but I'll try to summarize what I see see as advantages and disadvantages:
Advantages
USE db;
statement that guesses that all databases are accessible through a single connection.Disadvantages
Suggestions
If you want flexible database access, I'd suggest to use Django's database routers. Use Middleware or a view Mixin which automatically sets up a default database to use for the connection based on request parameters. You might have to resort to thread local data to store the default database to use so that when the router is hit, it knows which database to route to. This allows Django to use its existing persistent connections to a database (which may reside on different hosts if wanted), and chooses the database to use based on routing set up in the request.
This approach also has the advantage that the database for a query can be overridden if needed by using the QuerySet using()
function to select a database other than the default.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With