Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Labels in HAVING() Clause in SQLAlchemy

I'm attempting to implement the following query for handling nested sets (see here) within SQLAlchemy. What I'm struggling with is how to use the labeled depth calculation in the main SELECT query (which depends on the sub SELECT query) within the HAVING clause on the end.

SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
    nested_category AS parent,
    nested_category AS sub_parent,
    (
            SELECT node.name, (COUNT(parent.name) - 1) AS depth
            FROM nested_category AS node,
                    nested_category AS parent
            WHERE node.lft BETWEEN parent.lft AND parent.rgt
                    AND node.name = 'PORTABLE ELECTRONICS'
            GROUP BY node.name
            ORDER BY node.lft
    )AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
    AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
    AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.lft;

I feel like I'm very close when using:

node = aliased(Category)                                                                             
parent = aliased(Category)                                                                           
sub_parent = aliased(Category)

sub_tree = DBSession.query(node.name,                                                                
    (func.count(parent.name) - 1).label('depth')).\                                                  
    filter(node.lft.between(parent.lft, parent.rgt)).\                                               
    filter(node.name == category_name).\                                                             
    group_by(node.name).\                                                                            
    order_by(node.lft).subquery()                                                                    

children = DBSession.query(node.name,                                                                
    (func.count(parent.name) - (sub_tree.c.depth + 1)).label('depth')).\                             
    filter(node.lft.between(parent.lft, parent.rgt)).\
    filter(node.lft.between(sub_parent.lft, sub_parent.rgt)).\                                       
    filter(sub_parent.name == sub_tree.c.name).\                                                     
    group_by(node.name).having(depth <= 1).\                                                       
    order_by(node.lft).all()

But, I end up getting the error:

NameError: global name 'depth' is not defined

Which sort of makes sense. If I substitute having(depth <= 1) with having(func.count('depth') <= 1, I end up with the following HAVING clause generated that returns no results (Where %s placeholders are ('depth', 1)):

HAVING count(%s) <= %s

What I really need it to read exactly like is this:

HAVING depth = 1

Anybody have any ideas?

My last resort is to actually perform the raw query instead of going through the ORM layer, but I'd really rather not since I'm so close...

Thanks in advance.

Edit:

I have also tried the following code, but it does not return the correct results (as if the 'depth' label is always 0):

node = aliased(Category)                                                                             
parent = aliased(Category)                                                                           
sub_parent = aliased(Category)

sub_tree_depth = (func.count(parent.name) - 1).label('depth')
depth = (func.count(parent.name) - (sub_tree_depth + 1)).label('depth')

sub_tree = DBSession.query(node.name,
    sub_tree_depth).\
    filter(node.lft.between(parent.lft, parent.rgt)).\
    filter(node.name == category_name).\
    group_by(node.name).\
    order_by(node.lft).subquery()

children = DBSession.query(node.name, 
    depth).\
    filter(node.lft.between(parent.lft, parent.rgt)).\
    filter(node.lft.between(sub_parent.lft, sub_parent.rgt)).\
    filter(sub_parent.name == sub_tree.c.name).\
    group_by(node.name).having(depth <= 1).\
    order_by(node.lft).all()

The HAVING clause generated from this looks like (categories_2 == parent in original query):

HAVING count(categories_2.name) - ((count(categories_2.name) - 1) + 1) <= 1

Edit:

I figured it might be helpful to include the generated SQL.

SQLAlchemy

node = aliased(Category)
parent = aliased(Category)
sub_parent = aliased(Category)

sub_tree = DBSession.query(node.name,
    (func.count(parent.name) - 1).label('depth')).\
    filter(node.lft.between(parent.lft, parent.rgt)).\
    filter(node.name == category_name).\
    group_by(node.name).\
    order_by(node.lft).subquery()

depth = (func.count(parent.name) - (sub_tree.c.depth + 1)).label('depth')
children = DBSession.query(node.name, depth).\
    filter(node.lft.between(parent.lft, parent.rgt)).\
    filter(node.lft.between(sub_parent.lft, sub_parent.rgt)).\
    filter(sub_parent.name == sub_tree.c.name).\
    group_by(node.name).having(depth <= 1).\
    order_by(node.lft)

Generated SQL

'SELECT categories_1.name AS categories_1_name, count(categories_2.name) - (anon_1.depth + %s) AS depth 
FROM categories AS categories_1, categories AS categories_2, (SELECT categories_1.name AS name, count(categories_2.name) - %s AS depth 
FROM categories AS categories_1, categories AS categories_2 
WHERE categories_1.lft BETWEEN categories_2.lft AND categories_2.rgt AND categories_1.name = %s GROUP BY categories_1.name ORDER BY categories_1.lft) AS anon_1, categories AS categories_3 
WHERE categories_1.lft BETWEEN categories_2.lft AND categories_2.rgt AND categories_1.lft BETWEEN categories_3.lft AND categories_3.rgt AND categories_3.name = anon_1.name GROUP BY categories_1.name 
HAVING count(categories_2.name) - (anon_1.depth + %s) <= %s ORDER BY categories_1.lft' (1, 1, u'Institutional', 1, 1)
like image 468
dhildreth Avatar asked Jul 02 '14 05:07

dhildreth


People also ask

What does all () do in SQLAlchemy?

all() method. The Query object, when asked to return full entities, will deduplicate entries based on primary key, meaning if the same primary key value would appear in the results more than once, only one object of that primary key would be present.

What does SQLAlchemy all () return?

As the documentation says, all() returns the result of the query as a list.

What does query all () return?

all() will return all records which match our query as a list of objects.

What is subquery in SQLAlchemy?

The grouping is done with the group_by() query method, which takes the column to use for the grouping as an argument, same as the GROUP BY counterpart in SQL. The statement ends by calling subquery() , which tells SQLAlchemy that our intention for this query is to use it inside a bigger query instead of on its own.


Video Answer


1 Answers

Your SQL query uses implicit joins, in SQLAlchemy you need to define them explicitly. Other than that your second try is almost correct:

node = aliased(Category)
parent = aliased(Category)
sub_parent = aliased(Category)

sub_tree = DBSession.query(node.name,
    (func.count(parent.name) - 1).label('depth')).\
    join(parent, node.lft.between(parent.lft, parent.rgt)).\
    filter(node.name == category_name).\
    group_by(node.name).\
    order_by(node.lft).subquery()

depth = (func.count(parent.name) - (sub_tree.c.depth + 1)).label('depth')
children = DBSession.query(node.name, depth).\
    join(parent, node.lft.between(parent.lft, parent.rgt)).\
    join(sub_parent, node.lft.between(sub_parent.lft, sub_parent.rgt)).\
    join(sub_tree, sub_parent.name == sub_tree.c.name).\
    group_by(node.name, sub_tree.c.depth).\
    having(depth <= 1).\
    order_by(node.lft).all()

Don't be surprised if HAVING clause in generated SQL will have full expression repeated instead of its alias. That's because aliases are not allowed there, it's only MySQL extension, while SQLAlchemy strives to generate SQL that works in most cases.

like image 160
Audrius Kažukauskas Avatar answered Oct 07 '22 04:10

Audrius Kažukauskas