Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I define a SQLAlchemy relation representing the latest object in a collection?

I have a SQLAlchemy model with a one-to-many relationship between table x and table y. The record (if any) with the greatest id in table y where y.x_id = x.id is special. Class X and class Y map tables x and y.

I know how to define X.all_y (ORDER BY y.id). How do I define X.latest_y equivalent to X.all_y[-1]?

like image 443
joeforker Avatar asked Dec 18 '09 21:12

joeforker


1 Answers

the purely relational way to do it requires using a subquery to get the "latest" or "max" value, correlated to the parent, then equating that with the members of the collection. It means you'll get best results if you put an index on the column that determines "the latest":

from sqlalchemy import *
from sqlalchemy.orm import *

engine = create_engine('sqlite:///:memory:', echo='debug')

m = MetaData()

parent = Table('parent', m, 
                Column('id', Integer, primary_key=True)
)

child = Table('child', m, 
                Column('id', Integer, primary_key=True),
                Column('parent_id', Integer, ForeignKey('parent.id')),
                Column('sortkey', Integer)
                )

m.create_all(engine)

class Parent(object):
    def __init__(self, children):
        self.all_c = children

class Child(object):
    def __init__(self, sortkey):
        self.sortkey = sortkey

latest_c = select([func.max(child.c.sortkey)]).\
                where(child.c.parent_id==parent.c.id).\
                correlate(parent).\
                as_scalar()

mapper(Parent, parent, properties={
    'all_c':relation(Child),
    'latest_c':relation(Child, 
                            primaryjoin=and_(
                                child.c.sortkey==latest_c, 
                                child.c.parent_id==parent.c.id
                            ),
                            uselist=False
    )
})

mapper(Child, child)

session = sessionmaker(engine)()

p1, p2, p3 = Parent([Child('a'), Child('b'), Child('c')]), \
                Parent([Child('b'), Child('c')]),\
                Parent([Child('f'), Child('g'), Child('c')])

session.add_all([p1, p2, p3])
session.commit()

assert p1.latest_c.sortkey == 'c'
assert p2.latest_c.sortkey == 'c'
assert p3.latest_c.sortkey == 'g'

Alternatively, you can on some platforms use LIMIT, which can produce faster results since you avoid the aggregation and can join the collection item on its primary key:

latest_c = select([child.c.id]).\
                where(child.c.parent_id==parent.c.id).\
                order_by(child.c.sortkey.desc()).\
                limit(1).\
                correlate(parent).\
                as_scalar()

mapper(Parent, parent, properties={
    'all_c':relation(Child),
    'latest_c':relation(Child, 
                            primaryjoin=and_(
                                child.c.id==latest_c, 
                                child.c.parent_id==parent.c.id
                            ),
                            uselist=False
    )
})
like image 172
zzzeek Avatar answered Sep 21 '22 06:09

zzzeek