Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLAlchemy Nested CTE Query

The sqlalchemy core query builder appears to unnest and relocate CTE queries to the "top" of the compiled sql.

I'm converting an existing Postgres query that selects deeply joined data as a single JSON object. The syntax is pretty contrived but it significantly reduces network overhead for large queries. The goal is to build the query dynamically using the sqlalchemy core query builder.

Here's a minimal working example of a nested CTE

with res_cte as (
    select
        account_0.name acct_name,
        (
            with offer_cte as (
                select
                    offer_0.id
                from
                    offer offer_0
                where
                    offer_0.account_id = account_0.id
            )
            select
                array_agg(offer_cte.id)
            from
                offer_cte
        ) as offer_arr
    from
        account account_0
)
select 
    acct_name::text, offer_arr::text
from res_cte

Result

acct_name,  offer_arr
---------------------
oliver,     null
rachel,     {3}
buddy,      {4,5}

(my incorrect use of) the core query builder attempts to unnest offer_cte and results in every offer.id being associated with every account_name in the result.

There's no need to re-implement this exact query in an answer, any example that results in a similarly nested CTE would be perfect.

like image 208
Oliver Rice Avatar asked Oct 22 '19 23:10

Oliver Rice


1 Answers

I just implemented the nesting cte feature. It should land with 1.4.24 release.

Pull request: https://github.com/sqlalchemy/sqlalchemy/pull/6709

import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

# Model declaration
Base = declarative_base()

class Offer(Base):
    __tablename__ = "offer"
    
    id = sa.Column(sa.Integer, primary_key=True)
    account_id = sa.Column(sa.Integer, nullable=False)
    
class Account(Base):
    __tablename__ = "account"

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.TEXT, nullable=False)

# Query construction
account_0 = sa.orm.aliased(Account)

# Watch the nesting keyword set to True
offer_cte = (
    sa.select(Offer.id)
    .where(Offer.account_id == account_0.id)
    .select_from(Offer)
    .correlate(account_0).cte("offer_cte", nesting=True)
)
offer_arr = sa.select(sa.func.array_agg(offer_cte.c.id).label("offer_arr"))

res_cte = sa.select(
    account_0.name.label("acct_name"),
    offer_arr.scalar_subquery().label("offer_arr"),
).cte("res_cte")

final_query = sa.select(
    sa.cast(res_cte.c.acct_name, sa.TEXT),
    sa.cast(res_cte.c.offer_arr, sa.TEXT),
)

It constructs this query that returns the result you expect:

WITH res_cte AS 
(
    SELECT
        account_1.name AS acct_name
        , (
            WITH offer_cte AS 
(
                SELECT
                    offer.id AS id
                FROM
                    offer
                WHERE
                    offer.account_id = account_1.id
            )
            SELECT
                array_agg(offer_cte.id) AS offer_arr
            FROM
                offer_cte
        ) AS offer_arr
    FROM
        account AS account_1
)
 SELECT
    CAST(res_cte.acct_name AS TEXT) AS acct_name
    , CAST(res_cte.offer_arr AS TEXT) AS offer_arr
FROM
    res_cte
like image 68
Eric Masseran Avatar answered Nov 18 '22 11:11

Eric Masseran