I've got three related models: Process
, Factor
and Level
. A Process
has a many-to-many relationship with Factor
s and a Factor
will have one or more Level
s. I'm attempting to calculate all combinations of Level
s involved with a Process
. This is straightforward to implement with Python's itertools
as a model method, but is somewhat slow in execution speed, so I'm trying to figure out how to use the Django ORM to perform this calculation in SQL.
Models:
class Process(models.Model):
factors = models.ManyToManyField(Factor, blank = True)
class Factor(models.Model):
...
class Level(models.Model):
factor = models.ForeignKey(Factor, on_delete=models.CASCADE)
Example: A process 'Running'
involves three Factor
s ('Distance'
, 'Climb'
, 'Surface'
) each composed of a number of Level
s ('Long'
/'Short'
, 'Flat'
/'Hilly'
, 'Road'
/'Mixed'
/'Trail'
). Calculating the combinations in SQL would involve building a query by first determing how many Factor
s were involved (3 in this example) and performing a CROSS JOIN
of all the levels that many times.
In SQL, this could be accomplished as such:
WITH foo AS
(SELECT * FROM Level
WHERE Level.factor_id IN
(SELECT ProcessFactors.factor_id FROM ProcessFactors WHERE process_id = 1)
)
SELECT a1.*, a2.*, a3.*
FROM foo a1
CROSS JOIN foo a2
CROSS JOIN foo a3
WHERE (a1.factor_id < a2.factor_id) AND (a2.factor_id < a3.factor_id)
Result:
a1.name | a2.name | a3.name
--------------------------
Long | Flat | Road
Long | Flat | Mixed
Long | Flat | Trail
Long | Hilly | Road
Long | Hilly | Mixed
Long | Hilly | Trail
Short | Flat | Road
Short | Flat | Mixed
Short | Flat | Trail
Short | Hilly | Road
Short | Hilly | Mixed
Short | Hilly | Trail
Currently, I have this implemented as a method on the Process
model as:
def level_combinations(self):
levels = []
for factor in self.factors.all():
levels.append(Level.objects.filter(factor = factor))
combinations = []
for levels in itertools.product(*levels):
combination = {}
combination["levels"] = levels
combinations.append(combination)
return combinations
Is this possible using the Django ORM or is it complex enough that it should be implemented as a raw query to improve speed over the Python code implementation?
There was a similiar question about performing CROSS JOIN
in Django ORM from several years ago (approx. Django v1.3 it looks like) didn't attract appear to attract much attention (the author punted to just using Python itertools).
from itertools import groupby, product
def level_combinations(self):
# We need order by factor_id for proper grouping
levels = Level.objects.filter(factor__process=self).order_by('factor_id')
# [{'name': 'Long', 'factor_id': 1, ...},
# {'name': 'Short', 'factor_id': 1, ...},
# {'name': 'Flat', 'factor_id': 2, ...},
# {'name': 'Hilly', 'factor_id': 2, ...}]
groups = [list(group) for _, group in groupby(levels, lambda l: l.factor_id)]
# [[{'name': 'Long', 'factor_id': 1, ...},
# {'name': 'Short', 'factor_id': 1, ...}],
# [{'name': 'Flat', 'factor_id': 2, ...},
# {'name': 'Hilly', 'factor_id': 2, ...}]]
# Note: don't forget, that product is iterator/generator, not list
return product(*groups)
If order doesn't matter, then:
def level_combinations(self):
levels = Level.objects.filter(factor__process=self)
groups = {}
for level in levels:
groups.setdefault(level.factor_id, []).append(level)
return product(*groups.values())
A few years late, this workaround does not actually use a CROSS JOIN
, but it does produce the desired result in a single query.
Step 1: add a cross
field to your Factor
model
class Factor(models.Model):
cross = models.ForeignKey(
to='self', on_delete=models.CASCADE, null=True, blank=True)
...
Step 2: link 'Climb'
to 'Surface'
, and link 'Distance'
to 'Climb'
, using the new Factor.cross
field
Step 3: query as follows
Level.objects.filter(factor__name='Distance').values_list(
'name', 'factor__cross__level__name', 'factor__cross__cross__level__name')
Result:
('Long', 'Flat', 'Road')
('Long', 'Flat', 'Mixed')
('Long', 'Flat', 'Trail')
('Long', 'Hilly', 'Road')
('Long', 'Hilly', 'Mixed')
('Long', 'Hilly', 'Trail')
('Short', 'Flat', 'Road')
('Short', 'Flat', 'Mixed')
('Short', 'Flat', 'Trail')
('Short', 'Hilly', 'Road')
('Short', 'Hilly', 'Mixed')
('Short', 'Hilly', 'Trail')
This is a simplified example. To make it more generic, instead of adding the Factor.cross
field, you could add a new CrossedFactors
model with two foreignkeys to Factor
. This model can then be used to define various experimental designs.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With