I'm attempting to implement the following query for handling nested sets (see here) within SQLAlchemy. What I'm struggling with is how to use the labeled depth
calculation in the main SELECT
query (which depends on the sub SELECT
query) within the HAVING
clause on the end.
SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
nested_category AS parent,
nested_category AS sub_parent,
(
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'PORTABLE ELECTRONICS'
GROUP BY node.name
ORDER BY node.lft
)AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.lft;
I feel like I'm very close when using:
node = aliased(Category)
parent = aliased(Category)
sub_parent = aliased(Category)
sub_tree = DBSession.query(node.name,
(func.count(parent.name) - 1).label('depth')).\
filter(node.lft.between(parent.lft, parent.rgt)).\
filter(node.name == category_name).\
group_by(node.name).\
order_by(node.lft).subquery()
children = DBSession.query(node.name,
(func.count(parent.name) - (sub_tree.c.depth + 1)).label('depth')).\
filter(node.lft.between(parent.lft, parent.rgt)).\
filter(node.lft.between(sub_parent.lft, sub_parent.rgt)).\
filter(sub_parent.name == sub_tree.c.name).\
group_by(node.name).having(depth <= 1).\
order_by(node.lft).all()
But, I end up getting the error:
NameError: global name 'depth' is not defined
Which sort of makes sense. If I substitute having(depth <= 1)
with having(func.count('depth') <= 1
, I end up with the following HAVING
clause generated that returns no results (Where %s placeholders are ('depth', 1)):
HAVING count(%s) <= %s
What I really need it to read exactly like is this:
HAVING depth = 1
Anybody have any ideas?
My last resort is to actually perform the raw query instead of going through the ORM layer, but I'd really rather not since I'm so close...
Thanks in advance.
Edit:
I have also tried the following code, but it does not return the correct results (as if the 'depth' label is always 0):
node = aliased(Category)
parent = aliased(Category)
sub_parent = aliased(Category)
sub_tree_depth = (func.count(parent.name) - 1).label('depth')
depth = (func.count(parent.name) - (sub_tree_depth + 1)).label('depth')
sub_tree = DBSession.query(node.name,
sub_tree_depth).\
filter(node.lft.between(parent.lft, parent.rgt)).\
filter(node.name == category_name).\
group_by(node.name).\
order_by(node.lft).subquery()
children = DBSession.query(node.name,
depth).\
filter(node.lft.between(parent.lft, parent.rgt)).\
filter(node.lft.between(sub_parent.lft, sub_parent.rgt)).\
filter(sub_parent.name == sub_tree.c.name).\
group_by(node.name).having(depth <= 1).\
order_by(node.lft).all()
The HAVING
clause generated from this looks like (categories_2 == parent in original query):
HAVING count(categories_2.name) - ((count(categories_2.name) - 1) + 1) <= 1
Edit:
I figured it might be helpful to include the generated SQL.
SQLAlchemy
node = aliased(Category)
parent = aliased(Category)
sub_parent = aliased(Category)
sub_tree = DBSession.query(node.name,
(func.count(parent.name) - 1).label('depth')).\
filter(node.lft.between(parent.lft, parent.rgt)).\
filter(node.name == category_name).\
group_by(node.name).\
order_by(node.lft).subquery()
depth = (func.count(parent.name) - (sub_tree.c.depth + 1)).label('depth')
children = DBSession.query(node.name, depth).\
filter(node.lft.between(parent.lft, parent.rgt)).\
filter(node.lft.between(sub_parent.lft, sub_parent.rgt)).\
filter(sub_parent.name == sub_tree.c.name).\
group_by(node.name).having(depth <= 1).\
order_by(node.lft)
Generated SQL
'SELECT categories_1.name AS categories_1_name, count(categories_2.name) - (anon_1.depth + %s) AS depth
FROM categories AS categories_1, categories AS categories_2, (SELECT categories_1.name AS name, count(categories_2.name) - %s AS depth
FROM categories AS categories_1, categories AS categories_2
WHERE categories_1.lft BETWEEN categories_2.lft AND categories_2.rgt AND categories_1.name = %s GROUP BY categories_1.name ORDER BY categories_1.lft) AS anon_1, categories AS categories_3
WHERE categories_1.lft BETWEEN categories_2.lft AND categories_2.rgt AND categories_1.lft BETWEEN categories_3.lft AND categories_3.rgt AND categories_3.name = anon_1.name GROUP BY categories_1.name
HAVING count(categories_2.name) - (anon_1.depth + %s) <= %s ORDER BY categories_1.lft' (1, 1, u'Institutional', 1, 1)
all() method. The Query object, when asked to return full entities, will deduplicate entries based on primary key, meaning if the same primary key value would appear in the results more than once, only one object of that primary key would be present.
As the documentation says, all() returns the result of the query as a list.
all() will return all records which match our query as a list of objects.
The grouping is done with the group_by() query method, which takes the column to use for the grouping as an argument, same as the GROUP BY counterpart in SQL. The statement ends by calling subquery() , which tells SQLAlchemy that our intention for this query is to use it inside a bigger query instead of on its own.
Your SQL query uses implicit joins, in SQLAlchemy you need to define them explicitly. Other than that your second try is almost correct:
node = aliased(Category)
parent = aliased(Category)
sub_parent = aliased(Category)
sub_tree = DBSession.query(node.name,
(func.count(parent.name) - 1).label('depth')).\
join(parent, node.lft.between(parent.lft, parent.rgt)).\
filter(node.name == category_name).\
group_by(node.name).\
order_by(node.lft).subquery()
depth = (func.count(parent.name) - (sub_tree.c.depth + 1)).label('depth')
children = DBSession.query(node.name, depth).\
join(parent, node.lft.between(parent.lft, parent.rgt)).\
join(sub_parent, node.lft.between(sub_parent.lft, sub_parent.rgt)).\
join(sub_tree, sub_parent.name == sub_tree.c.name).\
group_by(node.name, sub_tree.c.depth).\
having(depth <= 1).\
order_by(node.lft).all()
Don't be surprised if HAVING
clause in generated SQL will have full expression repeated instead of its alias. That's because aliases are not allowed there, it's only MySQL extension, while SQLAlchemy strives to generate SQL that works in most cases.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With