I have a large number of data sets. Each data set consists of several database tables, and the schema is identical across sets. Each set of tables can have over a million rows. Each data set belongs to one job, and there are no relations between jobs. Each user owns one or more jobs. A set of tables is imported as a unit and eventually deleted as a unit. From a performance point of view, it is better to keep them as separate sets of tables.
So I would like to have several generic Django models, one for each of the several tables. I have achieved this in my views.py file with code similar to this:
from foobar.models import Foo, Bar

def my_view(request):
    prefix = request.GET.get('prefix')
    Foo._meta.db_table = prefix + '_foo'
    Bar._meta.db_table = prefix + '_bar'
    ....
    foobar_list = Foo.objects.filter(bar_id=myval)
    ...
My questions are: Is it safe to use this code with multiple concurrent users of a Django-based web application? Are the model objects shared across users? What would happen if two requests arrived simultaneously?
EDIT NO 2: I have considered Lie Ryan's answer and the comments and come up with this code:
from django.http import HttpResponse, HttpResponseNotFound
from django.db import models
from django.template import RequestContext, loader

def getModels(prefix):
    # Build one model class per table for this prefix.
    table_map = {}
    table_map["foo"] = type(str(prefix + '_foo'), (models.Model,), {
        '__module__': 'foobar.models',
        'id': models.IntegerField(primary_key=True),
        'foo': models.TextField(blank=True),
    })
    table_map["foo"]._meta.db_table = prefix + '_foo'
    table_map["bar"] = type(str(prefix + '_bar'), (models.Model,), {
        '__module__': 'foobar.models',
        'id': models.IntegerField(primary_key=True),
        'foo': models.ForeignKey(prefix + '_foo', null=True, blank=True),
    })
    table_map["bar"]._meta.db_table = prefix + '_bar'
    return table_map

def foobar_view(request):
    prefix = request.GET.get('prefix')
    # Only accept purely numeric prefixes.
    if prefix is not None and prefix.isdigit():
        table_map = getModels(prefix)
        # order_by is called on the manager; "filter.order_by" would raise AttributeError.
        foobar_list = table_map["bar"].objects.order_by('foo__foo')
        template = loader.get_template('foobar/foobar.html')
        context = RequestContext(request, {
            'foobar_list': foobar_list,
        })
        return HttpResponse(template.render(context))
    else:
        return HttpResponseNotFound('<h1>Page not found</h1>')
Now my question is: is this second draft of the code safe with multiple concurrent users?
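This technique is called sharding. No, it is not safe to do this if you serve concurrent requests with threads. Model classes are module-level objects shared by every thread in the process, so one request that changes Foo._meta.db_table can redirect a query that another request is about to run.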
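To make the failure mode concrete, here is a minimal sketch that uses no Django at all. `FakeOptions` and `Foo` are hypothetical stand-ins for a model's `_meta` object and a model class, and the two `threading.Event` objects only exist to force the unlucky interleaving that a real server would hit at random.

```python
import threading


class FakeOptions:
    """Hypothetical stand-in for the options object kept in Model._meta."""

    def __init__(self, db_table):
        self.db_table = db_table


class Foo:
    """Hypothetical stand-in for a model class: one shared object per process."""

    _meta = FakeOptions("foo")


def simulate_two_requests():
    """Force the interleaving: A sets its table, B sets its table, A queries."""
    seen = {}
    a_has_set = threading.Event()
    b_has_set = threading.Event()

    def request_a():
        Foo._meta.db_table = "1_foo"        # request for job 1 repoints the class
        a_has_set.set()
        b_has_set.wait()                    # job 2's request runs in this gap
        seen["job 1"] = Foo._meta.db_table  # table that job 1's query would use

    def request_b():
        a_has_set.wait()
        Foo._meta.db_table = "2_foo"        # job 2 repoints the very same class
        seen["job 2"] = Foo._meta.db_table
        b_has_set.set()

    threads = [threading.Thread(target=request_a),
               threading.Thread(target=request_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Calling `simulate_two_requests()` leaves `seen["job 1"]` equal to `"2_foo"`: the request for job 1 ends up reading job 2's table, because both requests mutated the same class.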
What you can do is to dynamically construct multiple classes pointing to different db_tables, and use a factory to select the right class.
from django.db import models

tables = ["foo", "bar"]
table_map = {}
for tbl in tables:
    class T(models.Model):
        class Meta:
            # db_table is a Meta option, not a plain class attribute
            db_table = tbl
        ... table definition ...
    table_map[tbl] = T
And then create a function that selects the right table_map based on how you shard your data.
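One way to sketch such a selector is a small cache keyed by prefix, so that each set of classes is built once per process rather than once per request. Everything here is an assumption for illustration: `get_models`, `forget_models`, and `build_plain_classes` are hypothetical names, and the builder creates plain Python classes so the example runs without Django. In a real project the builder would create `models.Model` subclasses with `Meta.db_table` set instead.

```python
import threading

# Hypothetical per-process cache: prefix -> {"foo": class, "bar": class}
_model_cache = {}
_cache_lock = threading.Lock()


def get_models(prefix, build_models):
    """Return the table_map for `prefix`, building it at most once."""
    with _cache_lock:
        if prefix not in _model_cache:
            _model_cache[prefix] = build_models(prefix)
        return _model_cache[prefix]


def forget_models(prefix):
    """Drop the cached classes, e.g. after a job's tables are deleted."""
    with _cache_lock:
        _model_cache.pop(prefix, None)


def build_plain_classes(prefix):
    """Stand-in builder: plain classes that only record their table name."""
    return {
        name: type(str(prefix + "_" + name), (object,),
                   {"db_table": prefix + "_" + name})
        for name in ("foo", "bar")
    }
```

With this shape, a view would call `get_models(prefix, build_plain_classes)["bar"]` and never mutate a shared class. Each prefix gets its own classes, so two requests for different jobs cannot redirect each other's queries.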
Also, be careful of SQL injection if you accept the table name from user input.
Alternatively, some database systems, such as PostgreSQL, allow multiple schemas per database, which might be a better way to separate your data in certain circumstances.
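A minimal sketch of one way to guard against that, assuming prefixes are short numeric job ids as in the question. The helper name `safe_table_name`, the ten-digit limit, and the `allowed_bases` tuple are assumptions for illustration. Note that `str.isdigit()` also accepts non-ASCII digits such as `"²"`, so an explicit `[0-9]` pattern is stricter.

```python
import re

# Assumption: a prefix is 1 to 10 ASCII digits and nothing else.
_PREFIX_RE = re.compile(r"[0-9]{1,10}")


def safe_table_name(prefix, base, allowed_bases=("foo", "bar")):
    """Build '<prefix>_<base>' only from a validated prefix and a known base."""
    if not isinstance(prefix, str) or not _PREFIX_RE.fullmatch(prefix):
        raise ValueError("invalid table prefix")
    if base not in allowed_bases:
        raise ValueError("unknown table name")
    return prefix + "_" + base
```

For example, `safe_table_name("42", "foo")` returns `"42_foo"`, while a prefix such as `"42; DROP TABLE x"` raises `ValueError`. Checking the prefix against the list of jobs the current user actually owns would be a further, stronger check.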