Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLAlchemy query shows error "Can't join table/selectable 'workflows' to itself"

I wrote a SQL query that I'm trying to port to SQLAlchemy, but receive the following error:

sqlalchemy.exc.InvalidRequestError: Can't join table/selectable 'workflows' to itself

SQL (working):

SELECT
   w.user_id, COUNT(l.id) 
FROM
   logs as l
INNER JOIN
   workflows as w 
   ON l.workflow_id = w.id
WHERE
   l.type = 's'
   AND l.timestamp > extract(epoch from now()) - 86400
GROUP BY
   w.user_id;

SQLAlchemy (not working):

session.query(
   Workflow.user_id, func.count(Log.id)
).join(
   Workflow, Workflow.id == Log.workflow_id
).filter(
   Log.type == 's', Log.timestamp > time.time() - 86400
).group_by(
   Workflow.user_id
).all()

Here is the expected output:

+----------+---------+
|  user_id |  count  |
+----------+---------+
| 1        | 5       |
| 2        | 10      |
+----------+---------+

What am I doing wrong?

like image 444
okoboko Avatar asked Aug 01 '15 22:08

okoboko


2 Answers

Part

.query(Workflow.user_id, func.count(Log.id))

adds both Workflow and Log to your query. The first model is marked as primary table and others are marked as secondary. If there is no calls to .join() afterwards, both primary and secondary tables will be added to the FROM clause. If there are calls to .join() it will move table it receives to the JOIN clause. The important thing here is that .join() can be applied only to the secondary table.

The problem is that your call to

.join(Workflow, Workflow.id == Log.workflow_id)

tries to mark primary table as joined. To fix issue you need to join secondary table:

.join(Log, Workflow.id == Log.workflow_id)

You can add echo=True to see SQL generated by SQLAlchemy. It's really convenient to debug your queries. Or you can compile single query to see generated SQL.

like image 60
Yaroslav Admin Avatar answered Oct 24 '22 08:10

Yaroslav Admin


As noted before, the primary entity of your query is Workflow, and so the join is trying to join Workflow to itself, which is not possible, at least not without using aliases.

In addition to simply reordering the join you can also explicitly control the left hand side of the join using Query.select_from():

session.query(
    Workflow.user_id, func.count(Log.id)
).select_from(
    Log
).join(
    Workflow, Workflow.id == Log.workflow_id
).filter(
    Log.type == 's', Log.timestamp > time.time() - 86400
).group_by(
    Workflow.user_id
).all()
like image 30
Ilja Everilä Avatar answered Oct 24 '22 08:10

Ilja Everilä