Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculating Combinations using Django ORM (CROSS JOIN)

I've got three related models: Process, Factor and Level. A Process has a many-to-many relationship with Factors and a Factor will have one or more Levels. I'm attempting to calculate all combinations of Levels involved with a Process. This is straightforward to implement with Python's itertools as a model method, but is somewhat slow in execution speed, so I'm trying to figure out how to use the Django ORM to perform this calculation in SQL.

Models:

class Process(models.Model):
    factors = models.ManyToManyField(Factor, blank = True)

class Factor(models.Model):
    ...

class Level(models.Model):
    factor = models.ForeignKey(Factor, on_delete=models.CASCADE)

Example: A process 'Running' involves three Factors ('Distance', 'Climb', 'Surface') each composed of a number of Levels ('Long'/'Short', 'Flat'/'Hilly', 'Road'/'Mixed'/'Trail'). Calculating the combinations in SQL would involve building a query by first determing how many Factors were involved (3 in this example) and performing a CROSS JOIN of all the levels that many times.

In SQL, this could be accomplished as such:

WITH foo AS
    (SELECT * FROM Level
     WHERE Level.factor_id IN
        (SELECT ProcessFactors.factor_id FROM ProcessFactors WHERE process_id = 1)
    )
SELECT a1.*, a2.*, a3.*
    FROM foo a1
    CROSS JOIN foo a2
    CROSS JOIN foo a3
WHERE (a1.factor_id < a2.factor_id) AND (a2.factor_id < a3.factor_id)

Result:

a1.name | a2.name | a3.name
--------------------------
Long    | Flat    | Road
Long    | Flat    | Mixed
Long    | Flat    | Trail
Long    | Hilly   | Road
Long    | Hilly   | Mixed
Long    | Hilly   | Trail
Short   | Flat    | Road
Short   | Flat    | Mixed
Short   | Flat    | Trail
Short   | Hilly   | Road
Short   | Hilly   | Mixed
Short   | Hilly   | Trail

Currently, I have this implemented as a method on the Process model as:

def level_combinations(self):
    levels = []
    for factor in self.factors.all():
        levels.append(Level.objects.filter(factor = factor))
    
    combinations = []
    for levels in itertools.product(*levels):
        combination = {}
        
        combination["levels"] = levels
        
        combinations.append(combination)
    
    return combinations

Is this possible using the Django ORM or is it complex enough that it should be implemented as a raw query to improve speed over the Python code implementation?

There was a similiar question about performing CROSS JOIN in Django ORM from several years ago (approx. Django v1.3 it looks like) didn't attract appear to attract much attention (the author punted to just using Python itertools).

like image 572
Linville Avatar asked Sep 26 '22 14:09

Linville


2 Answers

from itertools import groupby, product
    
def level_combinations(self):
    # We need order by factor_id for proper grouping
    levels = Level.objects.filter(factor__process=self).order_by('factor_id')
    # [{'name': 'Long', 'factor_id': 1, ...},
    #  {'name': 'Short', 'factor_id': 1, ...},
    #  {'name': 'Flat', 'factor_id': 2, ...},
    #  {'name': 'Hilly', 'factor_id': 2, ...}]

    groups = [list(group) for _, group in groupby(levels, lambda l: l.factor_id)]
    # [[{'name': 'Long', 'factor_id': 1, ...},
    #   {'name': 'Short', 'factor_id': 1, ...}],
    #  [{'name': 'Flat', 'factor_id': 2, ...},
    #   {'name': 'Hilly', 'factor_id': 2, ...}]]

    # Note: don't forget, that product is iterator/generator, not list
    return product(*groups)

If order doesn't matter, then:

def level_combinations(self):
    levels = Level.objects.filter(factor__process=self)
    groups = {}
    for level in levels:
        groups.setdefault(level.factor_id, []).append(level)
    return product(*groups.values())
like image 93
5 revs, 2 users 72% Avatar answered Oct 11 '22 16:10

5 revs, 2 users 72%


A few years late, this workaround does not actually use a CROSS JOIN, but it does produce the desired result in a single query.

Step 1: add a cross field to your Factor model

class Factor(models.Model):
    cross = models.ForeignKey(
        to='self', on_delete=models.CASCADE, null=True, blank=True)
    ...

Step 2: link 'Climb' to 'Surface', and link 'Distance' to 'Climb', using the new Factor.cross field

Step 3: query as follows

Level.objects.filter(factor__name='Distance').values_list(
    'name', 'factor__cross__level__name', 'factor__cross__cross__level__name')

Result:

('Long', 'Flat', 'Road')
('Long', 'Flat', 'Mixed')
('Long', 'Flat', 'Trail')
('Long', 'Hilly', 'Road')
('Long', 'Hilly', 'Mixed')
('Long', 'Hilly', 'Trail')
('Short', 'Flat', 'Road')
('Short', 'Flat', 'Mixed')
('Short', 'Flat', 'Trail')
('Short', 'Hilly', 'Road')
('Short', 'Hilly', 'Mixed')
('Short', 'Hilly', 'Trail')

This is a simplified example. To make it more generic, instead of adding the Factor.cross field, you could add a new CrossedFactors model with two foreignkeys to Factor. This model can then be used to define various experimental designs.

like image 37
djvg Avatar answered Oct 11 '22 16:10

djvg