Use SQLAlchemy to select only one row from a related table

Let's say I have an Author table and a Post table, and each Author can have several Posts.

Now, with a single SQLAlchemy query, I want to get all of my active Authors and the most recent published Post for each.

I've been trying to go at this by getting a list of Posts that joinedload the Author, using a subquery to group the results together, like this:

from sqlalchemy import and_, func
from sqlalchemy.orm import joinedload

subquery = DBSession.query(Author.id, func.max(Post.publish_date).label("publish_date")) \
    .join(Post.author) \
    .filter(Post.state == 'published') \
    .filter(Author.state == 'active') \
    .group_by(Author.id) \
    .subquery()

query = DBSession.query(Post) \
    .options(joinedload(Post.author)) \
    .join(Post.author) \
    .join(subquery, and_(Author.id == subquery.c.id, 
                         Post.publish_date == subquery.c.publish_date))

But if an Author's two newest Posts share the same publish_date, that Author appears twice in my results list. I could use a second subquery to eliminate dupes (take func.max(Post.id)), but that seems like really the wrong way to go about this. Is there a better way?

(Again, I'm looking for a single query, so I'm trying to avoid querying on the Author table, then looping through and doing a Post query for every Author in my results.)
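
The duplicate described above can be reproduced in plain SQL. The following is only a sketch using Python's built-in sqlite3 module, not the ORM query itself: the table and column names (author, post, author_id, state, publish_date) are assumptions chosen to mirror the models in the question, and the SQL is a hand-written approximation of the join against the grouped subquery.

```python
import sqlite3

# Hypothetical minimal schema; names are assumed to mirror the question's models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES author(id),
        state TEXT,
        publish_date TEXT
    );
    INSERT INTO author VALUES (1, 'active');
    INSERT INTO post VALUES (10, 1, 'published', '2014-10-01');
    -- Two published posts share the newest publish_date.
    INSERT INTO post VALUES (11, 1, 'published', '2014-10-15');
    INSERT INTO post VALUES (12, 1, 'published', '2014-10-15');
""")

# Join every post against the per-author MAX(publish_date), as the question does.
tied_rows = conn.execute("""
    SELECT post.author_id, post.id
    FROM post
    JOIN author ON author.id = post.author_id
    JOIN (
        SELECT author.id AS id, MAX(post.publish_date) AS publish_date
        FROM post
        JOIN author ON author.id = post.author_id
        WHERE post.state = 'published' AND author.state = 'active'
        GROUP BY author.id
    ) AS newest
      ON author.id = newest.id
     AND post.publish_date = newest.publish_date
    ORDER BY post.id
""").fetchall()

print(tied_rows)  # [(1, 11), (1, 12)]
```

Both tied posts match the maximum date, so author 1 comes back once per tied post.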

shroud asked Oct 16 '14 05:10


1 Answer

I would do it as follows:

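from sqlalchemy.orm import aliased

LastPost = aliased(Post, name='last')
# Correlated scalar subquery: the id of the newest Post for the current Author.
# Ties on publish_date are broken by the highest Post.id.
last_id = (
    DBSession.query(LastPost.id)
    .filter(LastPost.author_id == Author.id)
    .order_by(LastPost.publish_date.desc())
    .order_by(LastPost.id.desc())
    .limit(1)
    .correlate(Author)
    .as_scalar()  # spelled .scalar_subquery() in SQLAlchemy 1.4 and later
)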

query = (
    DBSession.query(Author, Post)
    .outerjoin(Post, Post.id == last_id)
)

for author, last_post in query:
    print(author, last_post)

As you can see, each result row is an (Author, Post) pair; with the outer join, the Post is None for an Author that has no Posts.
Change outerjoin to join if you only want authors that have at least one Post.
Also, I do not preload the Author-to-Post relationship, to avoid any confusion.
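
For reference, here is roughly what the correlated-subquery approach looks like as plain SQL, run through Python's built-in sqlite3 module. This is a sketch under assumptions, not the exact statement SQLAlchemy emits: the table and column names are hypothetical stand-ins for the question's models, and the two state filters (published posts, active authors) are added here to match the question, since the query above does not include them.

```python
import sqlite3

# Hypothetical minimal schema; names are assumed to mirror the question's models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES author(id),
        state TEXT,
        publish_date TEXT
    );
    INSERT INTO author VALUES (1, 'active'), (2, 'active');
    -- Author 1 has two posts tied for newest; author 2 has no posts.
    INSERT INTO post VALUES
        (10, 1, 'published', '2014-10-01'),
        (11, 1, 'published', '2014-10-15'),
        (12, 1, 'published', '2014-10-15');
""")

# For each author, the subquery picks exactly one post id: newest
# publish_date first, highest id as the tie-breaker.
rows = conn.execute("""
    SELECT author.id, post.id
    FROM author
    LEFT OUTER JOIN post ON post.id = (
        SELECT last_post.id
        FROM post AS last_post
        WHERE last_post.author_id = author.id
          AND last_post.state = 'published'
        ORDER BY last_post.publish_date DESC, last_post.id DESC
        LIMIT 1
    )
    WHERE author.state = 'active'
    ORDER BY author.id
""").fetchall()

print(rows)  # [(1, 12), (2, None)]
```

Because the subquery is limited to one row per author, a tie on publish_date can no longer produce a second row, and the outer join keeps authors without posts.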

van answered Sep 21 '22 12:09