How can I get computed elements of a table in a django queryset?

Tags:

postgresql

django

django-queryset

I'm trying to use Django's queryset API to emulate the following query:

SELECT EXTRACT(year FROM chosen_date) AS year,
       EXTRACT(month FROM chosen_date) AS month,
       date_paid IS NOT NULL AS is_paid
FROM (SELECT (CASE WHEN date_due IS NULL THEN date ELSE date_due END) AS chosen_date, *
      FROM invoice_invoice) AS t1;

The idea is that in certain situations I'd rather use the date_due column than the date column, but since date_due is optional, I sometimes have to fall back to date anyway. I want to create a computed column chosen_date so that I don't have to change the rest of the queries.

Here is my first stab at emulating this. I couldn't see how to do the null test properly with the base API, so I went with extra:

if(use_date_due):
    sum_qs = sum_qs.extra(select={'chosen_date': 'CASE WHEN date_due IS NULL THEN date ELSE date_due END'})
else: 
    sum_qs = sum_qs.extra(select={'chosen_date':'date'})
sum_qs = sum_qs.extra(select={'year': 'EXTRACT(year FROM chosen_date)',
                              'month': 'EXTRACT(month FROM chosen_date)',
                              'is_paid':'date_paid IS NOT NULL'})

But the issue I'm having is that when I run the second query, I get an error saying the chosen_date column doesn't exist. I've had similar errors later on when trying to use computed columns (like from within annotate() calls), but haven't found anything in the documentation about how computed columns differ from "base" ones. Does anyone have any insight on this?

(edited python code because previous version had an obvious logic flaw (forgot the else branch). still doesn't work)


asked Aug 16 '13 08:08

rtpg

2 Answers

Short answer: If you create an aliased (or computed) column using extra(select=...) then you cannot use the aliased column in a subsequent call to filter(). Also, as you've discovered, you can't use the aliased column in later calls to extra(select=...) or extra(where=...).

An attempt to explain why:

For example:

qs = MyModel.objects.extra(select={'alias_col': 'title'})

#FieldError: Cannot resolve keyword 'alias_col' into field...
filter_qs = qs.filter(alias_col='Camembert')

#DatabaseError: column "alias_col" does not exist
extra_qs = qs.extra(select={'another_alias': 'alias_col'})

filter_qs will try to produce a query like:

SELECT (title) AS "alias_col", "myapp_mymodel"."title"
FROM "myapp_mymodel"
WHERE alias_col = 'Camembert';

And extra_qs tries something like:

SELECT (title) AS "alias_col", (alias_col) AS "another_alias",
        "myapp_mymodel"."title"
FROM "myapp_mymodel";

Neither of those is valid SQL. In general, if you want to use a computed column's alias multiple times in the SELECT or WHERE clauses of a query, you actually need to compute it each time. This is why Roman Pekar's answer solves your specific problem: instead of trying to compute chosen_date once and then use it again later, he computes it each time it's needed.

You mention annotation/aggregation in your question. You can use filter() on the aliases created by annotate(), so I'd be interested in seeing the similar errors you're talking about; it's been fairly robust in my experience. That works because when you filter on an alias created by annotate(), the ORM recognizes what you're doing and replaces the alias with the computation that created it.
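The alias behaviour can be checked without Django. Below is a minimal sketch that uses Python's built-in sqlite3 module instead of PostgreSQL (an assumption, made only so that it runs anywhere); the table mymodel and its single row are made up for illustration. It shows that an alias cannot be reused inside the same SELECT list, but becomes usable once the computation is wrapped in a subquery.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO mymodel (title) VALUES ('Camembert')")

# Reusing an alias inside the same SELECT list is rejected by the database.
try:
    conn.execute(
        "SELECT title AS alias_col, alias_col AS another_alias FROM mymodel"
    ).fetchall()
    alias_error = None
except sqlite3.OperationalError as exc:
    alias_error = str(exc)

# Wrapping the computation in a subquery makes the alias a real column
# of the derived table, so the outer query can refer to it freely.
rows = conn.execute(
    "SELECT alias_col, alias_col AS another_alias "
    "FROM (SELECT title AS alias_col FROM mymodel) AS t1"
).fetchall()
```

Note that databases differ in the details (SQLite, for instance, is more lenient than PostgreSQL about aliases in WHERE), so this only illustrates the SELECT-list case.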

So as an example:

from django.db.models import Max

qs = MyModel.objects.annotate(alias_col=Max('id'))
qs = qs.filter(alias_col__gt=0)

Produces something like:

SELECT "myapp_mymodel"."id", "myapp_mymodel"."title",
        MAX("myapp_mymodel"."id") AS "alias_col"
FROM "myapp_mymodel"
GROUP BY "myapp_mymodel"."id", "myapp_mymodel"."title"
HAVING MAX("myapp_mymodel"."id") > 0;

Writing "HAVING alias_col > 0" instead wouldn't work, because PostgreSQL does not allow a SELECT alias to be referenced in HAVING.

I hope that's helpful. If there's anything I've explained badly let me know and I'll see if I can improve it.


answered Nov 06 '22 06:11

Kevin

Here are some workarounds:

1. In your particular case you could do it with one extra:

if use_date_due:
    sum_qs = sum_qs.extra(select={
                          'year': 'EXTRACT(year FROM coalesce(date_due, date))',
                          'month': 'EXTRACT(month FROM coalesce(date_due, date))',
                          'is_paid':'date_paid IS NOT NULL'
                        })

2. It's also possible to use plain Python to get the data you need:

for x in sum_qs:
    chosen_date = x.date_due if use_date_due and x.date_due else x.date
    print(chosen_date.year, chosen_date.month)

or

[(y.year, y.month) for y in (x.date_due if use_date_due and x.date_due else x.date for x in sum_qs)]
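The plain-Python route can also cover the aggregation step. Here is a rough sketch, assuming a hypothetical Invoice dataclass that stands in for the Django model (the helper names chosen_date and paid_count_by_month are made up as well); it counts paid invoices per (year, month) of the chosen date.

```python
import datetime
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invoice:
    """Hypothetical stand-in for the Django model, for illustration only."""
    date: datetime.date
    date_due: Optional[datetime.date] = None
    date_paid: Optional[datetime.date] = None


def chosen_date(invoice, use_date_due):
    """Return date_due when requested and present, else fall back to date."""
    if use_date_due and invoice.date_due:
        return invoice.date_due
    return invoice.date


def paid_count_by_month(invoices, use_date_due):
    """Count paid invoices per (year, month) of the chosen date."""
    counts = Counter()
    for invoice in invoices:
        d = chosen_date(invoice, use_date_due)
        # Adding 0 still registers the month, so unpaid months show up too.
        counts[(d.year, d.month)] += int(invoice.date_paid is not None)
    return dict(counts)


invoices = [
    Invoice(datetime.date(2013, 7, 30),
            date_due=datetime.date(2013, 8, 15),
            date_paid=datetime.date(2013, 8, 10)),
    Invoice(datetime.date(2013, 8, 1)),
]
```

With a real queryset you would pass sum_qs in place of the invoices list. Keep in mind that this pulls every row into Python, so it only suits modest table sizes.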

3. In the SQL world, this kind of computed field is usually done with a subquery or a common table expression (CTE). I prefer CTEs because of their readability. It could look like this:

with cte1 as (
    select
        *, coalesce(date_due, date) as chosen_date
    from polls_invoice
)
select
    *,
    extract(year from chosen_date) as year,
    extract(month from chosen_date) as month,
    case when date_paid is not null then 1 else 0 end as is_paid
from cte1

You can also chain as many CTEs as you want:

with cte1 as (
    select
        *, coalesce(date_due, date) as chosen_date
    from polls_invoice
), cte2 as (
    select
        extract(year from chosen_date) as year,
        extract(month from chosen_date) as month,
        case when date_paid is not null then 1 else 0 end as is_paid
    from cte1
)
select
    year, month, sum(is_paid) as paid_count
from cte2
group by year, month
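To try the chained CTE without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. Two things are assumptions made for the sake of a runnable example: SQLite has no extract(), so strftime() is swapped in, and the three sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE polls_invoice ("
    "id INTEGER PRIMARY KEY, date TEXT, date_due TEXT, date_paid TEXT)"
)
# Made-up sample rows: (date, date_due, date_paid)
conn.executemany(
    "INSERT INTO polls_invoice (date, date_due, date_paid) VALUES (?, ?, ?)",
    [
        ("2013-07-30", "2013-08-15", "2013-08-10"),
        ("2013-08-01", None, None),
        ("2013-09-05", None, "2013-09-06"),
    ],
)

# Same shape as the chained CTE, with strftime() replacing extract().
query = """
with cte1 as (
    select *, coalesce(date_due, date) as chosen_date
    from polls_invoice
), cte2 as (
    select
        cast(strftime('%Y', chosen_date) as integer) as year,
        cast(strftime('%m', chosen_date) as integer) as month,
        case when date_paid is not null then 1 else 0 end as is_paid
    from cte1
)
select year, month, sum(is_paid) as paid_count
from cte2
group by year, month
order by year, month
"""
rows = conn.execute(query).fetchall()
```

Each row in rows is a (year, month, paid_count) tuple; the first invoice is counted under August because its date_due takes precedence over its date.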

So in Django you can use a raw query like:

Invoice.objects.raw('''
    with cte1 as (
        select
            *, coalesce(date_due, date) as chosen_date
        from polls_invoice
    )
    select
        *,
        extract(year from chosen_date) as year,
        extract(month from chosen_date) as month,
        case when date_paid is not null then 1 else 0 end as is_paid
    from cte1''')

and you will have Invoice objects with some additional properties.

4. Or you can simply substitute the field expressions into your query with plain Python:

if use_date_due:
    chosen_date = 'coalesce(date_due, date)'
else:
    chosen_date = 'date'

year = 'extract(year from {})'.format(chosen_date)
month = 'extract(month from {})'.format(chosen_date)
fields = {'year': year, 'month': month, 'is_paid': 'date_paid is not null', 'chosen_date': chosen_date}
sum_qs = sum_qs.extra(select=fields)
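If the same substitution is needed in several places, it could be wrapped in a small helper. This is only a sketch: build_select is a hypothetical name, not part of Django, and it just builds the dictionary that extra(select=...) expects.

```python
def build_select(use_date_due):
    """Build the select mapping for extra(); hypothetical helper."""
    # Inline the expression everywhere instead of referring to an alias,
    # since an alias cannot be reused within the same SELECT list.
    chosen_date = "coalesce(date_due, date)" if use_date_due else "date"
    return {
        "chosen_date": chosen_date,
        "year": "extract(year from {})".format(chosen_date),
        "month": "extract(month from {})".format(chosen_date),
        "is_paid": "date_paid is not null",
    }


# Intended usage with a queryset (requires Django, so left as a comment):
# sum_qs = sum_qs.extra(select=build_select(use_date_due))
```

Because the expression strings are fixed in code and never built from user input, the string formatting here does not open up SQL injection.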

answered Nov 06 '22 06:11

Roman Pekar
