Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLAlchemy: order_by(None) for joinedload subclause queries?

We are using SQLAlchemy 0.9.8 on Python 2.7.7 and Postgres 9.3.

We have a query that uses joinedloads to fully populate some Recipe objects using a single query. The query creates a large SQL statement that takes 20 seconds to execute - too long. Here's the rendered SQL statement on Pastebin.

The rendered SQL has an ORDER BY clause that Postgres explain says is the source of 99% of the time spent on this query. This appears to come from the relationship in the ORM model, which has an order_by clause.

However, we don't care about the order the results are returned for this query - we only care about the order when looking at a single object. If I remove the ORDER BY clause at the end of the rendered SQL statement, the query executes in less than a second - perfect.

We tried using .order_by(None) on the query, but that seems to have no effect. The ORDER BY seems to be related to the joinedloads, because if change the joinedloads to lazyloads, they go away. But we need the joinedloads for speed.

How can I get SQLAlchemy to omit the ORDER BY clauses?


FYI, here's the query:

missing_recipes = cls.query(session).filter(Recipe.id.in_(missing_recipe_ids)) if missing_recipe_ids else []

Here's an excerpt from the ORM class:

class Recipe(Base, TransactionalIdMixin, TableCacheMixin, TableCreatedModifiedMixin):
    __tablename__ = 'recipes'
      authors = relationship('RecipeAuthor', cascade=OrmCommonClass.OwnedChildCascadeOptions,
                           single_parent=True,
                           lazy='joined', order_by='RecipeAuthor.order', backref='recipe')
    scanned_photos = relationship(ScannedPhoto, backref='recipe', order_by="ScannedPhoto.position")
    utensils = relationship(CookingUtensil, secondary=lambda: recipe_cooking_utensils_table)
    utensil_labels = association_proxy('utensils', 'name')

Our query() method looks something like this (some more joinedloads omitted):

@classmethod
def query(cls, session):
    query = query.options(
        joinedload(cls.ingredients).joinedload(RecipeIngredient.ingredient),
        joinedload(cls.instructions),
        joinedload(cls.scanned_photos),
        joinedload(cls.tags),
        joinedload(cls.authors),
    )
like image 706
Adam F Avatar asked Mar 17 '23 12:03

Adam F


1 Answers

[copying from my answer on the mailing list]

You either need to take order_by off the relationship(), probably the best idea here if ordering isn’t important, or otherwise skip joinedload(), write out the joins yourself and use contains_eager() (http://docs.sqlalchemy.org/en/rel_0_9/orm/loading_relationships.html?highlight=contains_eager#contains-eager).

joinedload() is sort of a macro that creates the joins and other modifications to the query (such as ORDER BY the relationship), applies aliases to each of those parts so that there’s no chance of them conflicting with anything on the query, and then routes the columns from those extra FROM clauses into collections and related objects. contains_eager() does just the very last part of that. The first two parts, writing the joins and orderings and potentially aliasing them (or not), is up to you in that case, so you retain full control over how the query is rendered.

like image 104
zzzeek Avatar answered Mar 23 '23 20:03

zzzeek