Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django ManyToMany through with multiple databases

TLTR: Django does not include database names in SQL queries, can I somehow force it to do this or is there a workaround?

The long version:

I have two legacy MySQL databases (Note: I have no influence on the DB layout) for which I'm creating a readonly API using DRF on Django 1.11 and python 3.6

I'm working around the referential integrity limitation of MyISAM DBs by using the SpanningForeignKey field suggested here: https://stackoverflow.com/a/32078727/7933618

I'm trying to connect a table from DB1 to a table from DB2 via a ManyToMany through table on DB1. That's the query Django is creating:

SELECT "table_b"."id" FROM "table_b" INNER JOIN "throughtable" ON ("table_b"."id" = "throughtable"."b_id") WHERE "throughtable"."b_id" = 12345

Which of course gives me an Error "Table 'DB2.throughtable' doesn't exist" because throughtable is on DB1 and I have no idea how to force Django to prefix the tables with the DB name. The query should be:

SELECT table_b.id FROM DB2.table_b INNER JOIN DB1.throughtable ON (table_b.id = throughtable.b_id) WHERE throughtable.b_id = 12345

Models for app1 db1_app/models.py: (DB1)

class TableA(models.Model):
    id = models.AutoField(primary_key=True)
    # some other fields
    relations = models.ManyToManyField(TableB, through='Throughtable')

class Throughtable(models.Model):
    id = models.AutoField(primary_key=True)
    a_id = models.ForeignKey(TableA, to_field='id')
    b_id = SpanningForeignKey(TableB, db_constraint=False, to_field='id')

Models for app2 db2_app/models.py: (DB2)

class TableB(models.Model):
    id = models.AutoField(primary_key=True)
    # some other fields

Database router:

def db_for_read(self, model, **hints):
    if model._meta.app_label == 'db1_app':
        return 'DB1'

    if model._meta.app_label == 'db2_app':
        return 'DB2'

    return None

Can I force Django to include the database name in the query? Or is there any workaround for this?

like image 444
masterfloda Avatar asked Jul 10 '17 17:07

masterfloda


1 Answers

I have a similar setup with PostgreSQL. Utilizing search_path to make cross-schema references possible in Django (schema in postgres = database in mysql). Unfortunately, seems like MySQL doesn't have such a mechanism.

However, you might try your luck creating views for it. Make views in one databases that references other databases, use it to select data. I think it's the best option since you want your data read-only anyway.

It's however not a perfect solution, executing raw queries might be more useful in some cases.


UPD: Providing mode details about my setup with PostgreSQL (as requested by bounty later). I couldn't find anything like search_path in MySQL documentation.

Quick intro

PostgreSQL has Schemas. They are synonymous to MySQL databases. So if you are MySQL user, imaginatively replace word "schema" with word "database". Requests can join tables between schemas, create foreign keys, etc... Each user (role) has a search_path:

This variable [search_path] specifies the order in which schemas are searched when an object (table, data type, function, etc.) is referenced by a simple name with no schema specified.

Special attention on "no schema specified", because that's exactly what Django does.

Example: Legacy databases

Let's say we got coupe legacy schemas, and since we are not allowed to modify them, we also want one new schema to store the NM relation in it.

  • old1 is the first legacy schema, it has old1_table (which is also the model name, for convenience sake)
  • old2 is the second legacy schema, it has old2_table
  • django_schema is a new one, it will store the required NM relation

All we need to do is:

alter role django_user set search_path = django_schema, old1, old2;

This is it. Yes, that simple. Django has no names of the schemas ("databases") specified anywhere. Django actually has no idea what is going on, everything is managed by PostgreSQL behind the scenes. Since django_schema is first in the list, new tables will be created there. So the following code ->

class Throughtable(models.Model):
    a_id = models.ForeignKey('old1_table', ...)
    b_id = models.ForeignKey('old2_table', ...)

-> will result in a migration that creates table throughtable that references old1_table and old2_table.

Problems: if you happened to have several tables with same names, you will either need to rename them or still trick Django into using a dot inside of table names.

like image 137
Art Avatar answered Oct 04 '22 15:10

Art