So I have a booking system. Agents (the people and organisations submitting bookings) are only allowed to make booking in the categories we assign them. Many agents can assign to the same categories. It's a simple many-to-many. Here's an idea of what the models look like:
class Category(models.Model):
pass
class Agent(models.Model):
categories = models.ManyToManyField('Category')
class Booking(models.Model):
agent = models.ForeignKey('Agent')
category = models.ForeignKey('Category')
So when a booking comes in, we dynamically allocate the category based on which are available to the agent. The agent usually doesn't specify.
We have just noticed that —by the grace of a silly admin mistake— some agents were allowed submit Bookings to any category. It has left us with thousands of bookings in the wrong place.
I can fix this but the I can only get it to work by nesting lookups:
for agent in Agent.objects.all():
for booking in Booking.objects.filter(agent=agent):
if booking.category not in agent.categories.all():
# go through the automated allocation logic again
This works but it's super-slow. It's a lot of data flying between database and Django. This isn't a one-off either. I want to periodically audit new bookings to make sure they are in the correct place. It doesn't seem impossible that another admin issue will occur so after checking the Agent database, I want to query for Bookings that aren't in their agent's categories.
Again, nested queries will work not but as our datasets grow into millions (and beyond) I'd like to do this more efficiently..
I feel like it should be possible to do this with a F()
lookup, something like this:
from django.db.models import F
bad = Booking.objects.exclude(category__in=F('agent__categories'))
But this doesn't work: TypeError: 'Col' object is not iterable
I've also tried .exclude(category=F('agent__categories'))
and while it's happier with the syntax there, it doesn't exclude the "correct" bookings.
What's the secret formula for doing this sort of F()
query on a M2M?
To help nail down exactly what I'm after I've set up a Github repo with these models (and some data). Please use them to write the query. The current sole answer hits and issue I was seeing on my "real" data too.
git clone https://github.com/oliwarner/djangorelquerytest.git
cd djangorelquerytest
python3 -m venv venv
. ./venv/bin/activate
pip install ipython Django==1.9a1
./manage.py migrate
./manage.py shell
And in the shell, fire in:
from django.db.models import F
from querytest.models import Category, Agent, Booking
Booking.objects.exclude(agent__categories=F('category'))
Is that a bug? Is there a proper way to achieve this?
There is a chance that I might be wrong, but I think doing it in reverse should do the trick:
bad = Booking.objects.exclude(agent__categories=F('category'))
Edit
If above won't work, here is another idea. I've tried similar logic on the setup I have and it seems to work. Try adding an intermediate model for ManyToManyField
:
class Category(models.Model):
pass
class Agent(models.Model):
categories = models.ManyToManyField('Category', through='AgentCategory')
class AgentCategory(models.Model):
agent = models.ForeignKey(Agent, related_name='agent_category_set')
category = models.ForeignKey(Category, related_name='agent_category_set')
class Booking(models.Model):
agent = models.ForeignKey('Agent')
category = models.ForeignKey('Category')
Then you can do a query:
bad = Booking.objects.exclude(agent_category_set__category=F('category'))
Of course specifying an intermediate model has it's own implications, but I am sure you can handle them.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With