I have the following model schema in Django (with Postgres).
class A(Models.model):
related = models.ManyToManyField("self", null=True)
Given a QuerySet of A, I would like to return a dictionary mapping each instance of A
in the QuerySet to a list of id
s of its related
instances as quickly as possible.
I can surely iterate through each A and query the related field, but is there a more optimal way?
In Django, we can use the id__in query with a queryset to filter down a queryset based on a list of IDs. However, by default this will fail if your IDs are UUIDs.
By default, Django adds an id field to each model, which is used as the primary key for that model. You can create your own primary key field by adding the keyword arg primary_key=True to a field.
According you have Three instances. You can use the values_list
method to retrieve just the results and from this result get just the ID's of their related
instances.
I use the pk
field to be my filter because i don't know your scheme, but you can use anything, just must be a QuerySet
.
>>> result = A.objects.filter(pk=1)
>>> result.values('related__id')
[{'id': 2}, {'id': 3}]
>>> result.values_list('related__id')
[(2,), (3,)]
>>> result.values_list('related__id', flat=True)
[2, 3]
You can get pretty close like this:
qs = A.objects.prefetch_related(Prefetch(
'related',
queryset=A.objects.only('pk'),
to_attr='related_insts')).in_bulk(my_list_of_pks)
This will give a mapping from pks of the current object to the instance itself, so you can iterate through as follows:
for pk, inst in qs.iteritems():
related_ids = (related.pk for related in inst.related_insts)
Or given an instance, you can do a fast lookup like so:
related_ids = (related.pk for related in qs[instance.pk]).
This method maps the instance ids to the related ids (indirectly) since you specifically requested a dictionary. If you aren't doing lookups, you may want the following instead:
qs = A.objects.prefetch_related(Prefetch(
'related',
queryset=A.objects.only('pk'),
to_attr='related_insts')).filter(pk__in=my_list_of_pks)
for inst in qs:
related_ids = (related.pk for related in inst.related_insts)
You may take note of the use of only
to only pull the pks from the db. There is an open ticket to allow the use of values
and (I presume) values_list
in Prefetch queries. This would allow you to do the following.
qs = A.objects.prefetch_related(Prefetch(
'related',
queryset=A.objects.values_list('pk', flat=True),
to_attr='related_ids')).filter(pk__in=my_list_of_pks)
for inst in qs:
related_ids = inst.related_ids
You could of course optimize further, for example by using qs.only('related_insts')
on the primary queryset, but make sure you aren't doing anything with these instances-- they're essentially just expensive containers to hold your related_ids.
I believe this is the best that's available for now (without custom queries). To get to exactly what you want, two things are needed:
values_list
is made to work with Prefetch to_attr
like it does for annotations.With these two things in place (and continuing the above example) you could do the following to get exactly what you requested:
d = qs.values_list('related_ids', flat=True).in_bulk()
for pk, related_pks in d.items():
print 'Containing Objects %s' % pk
print 'Related objects %s' % related_pks
# And lookups
print 'Object %d has related objects %s' % (20, d[20])
I've left off some details explaining things, but it should be pretty clear from the documentation. If you need any clarification, don't hesitate!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With