Query with LEFT JOIN not returning rows for count of 0

Tags: sql, postgresql, count, left-join

I am trying to get the following to return a count for every organization using a left join in PostgreSQL, but I cannot figure out why it's not working:

    select o.name as organisation_name,
           coalesce(COUNT(exam_items.id)) as total_used
    from organisations o
    left join exam_items e on o.id = e.organisation_id
    where e.item_template_id = #{sanitize(item_template_id)}
    and e.used = true
    group by o.name
    order by o.name

Using coalesce doesn't seem to work. I'm at my wit's end! Any help would certainly be appreciated!

To clarify what's not working, at the moment the query only returns values for organisations that have a count greater than 0. I would like it to return a line for every organisation, regardless of the count.

Table definitions:

    TABLE exam_items
      id serial NOT NULL
      exam_id integer
      item_version_id integer
      used boolean DEFAULT false
      question_identifier character varying(255)
      organisation_id integer
      created_at timestamp without time zone NOT NULL
      updated_at timestamp without time zone NOT NULL
      item_template_id integer
      stem_id integer
      CONSTRAINT exam_items_pkey PRIMARY KEY (id)

    TABLE organisations
      id serial NOT NULL
      slug character varying(255)
      name character varying(255)
      code character varying(255)
      address text
      organisation_type integer
      created_at timestamp without time zone NOT NULL
      updated_at timestamp without time zone NOT NULL
      super boolean DEFAULT false
      CONSTRAINT organisations_pkey PRIMARY KEY (id)

asked Mar 17 '13 23:03

mulus

1 Answer

Fix the `LEFT JOIN`

This should work:

    SELECT o.name AS organisation_name, count(e.id) AS total_used
    FROM   organisations   o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
                            AND e.item_template_id = #{sanitize(item_template_id)}
                            AND e.used
    GROUP  BY o.name
    ORDER  BY o.name;

You had a `LEFT [OUTER] JOIN`, but the later `WHERE` conditions made it act like a plain `[INNER] JOIN`.
Move the condition(s) to the `JOIN` clause to make it work as intended. This way, only rows that fulfill all these conditions are joined in the first place; for organisations without any such row, the columns from the right table are filled with NULL. The way you had it, joined rows are tested against the additional conditions logically after the `LEFT JOIN`, and the NULL-extended rows are removed because they cannot pass, just like with a plain `JOIN`.
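To see the difference side by side, here is a minimal sketch that runs both variants against a tiny in-memory database. It assumes SQLite (via Python's `sqlite3`) as a stand-in for PostgreSQL, trimmed-down versions of the two tables, and made-up sample rows; the template id `7` is hypothetical as well.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for PostgreSQL here.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (
        id INTEGER PRIMARY KEY,
        organisation_id INTEGER,
        item_template_id INTEGER,
        used BOOLEAN DEFAULT 0
    );
    INSERT INTO organisations VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    -- Alpha: two used items for template 7; Beta: one unused item; Gamma: none
    INSERT INTO exam_items (organisation_id, item_template_id, used)
    VALUES (1, 7, 1), (1, 7, 1), (2, 7, 0);
""")

# Filter in WHERE: NULL-extended and non-matching rows are dropped after the join.
where_rows = con.execute("""
    SELECT o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    WHERE  e.item_template_id = ? AND e.used
    GROUP  BY o.name
    ORDER  BY o.name
""", (7,)).fetchall()

# Filter in ON: every organisation survives, unmatched ones count 0.
on_rows = con.execute("""
    SELECT o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
                            AND e.item_template_id = ?
                            AND e.used
    GROUP  BY o.name
    ORDER  BY o.name
""", (7,)).fetchall()

print(where_rows)  # [('Alpha', 2)]
print(on_rows)     # [('Alpha', 2), ('Beta', 0), ('Gamma', 0)]
```

Only the placement of the two filter conditions differs between the queries; the join condition on `organisation_id` is the same in both.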

`count()` never returns NULL to begin with. It's an exception among aggregate functions in this respect. Therefore, ~~`COALESCE(COUNT(col))`~~ never makes sense, even with additional parameters. The manual:

> It should be noted that **except for `count`**, these functions return a null value when no rows are selected.

Bold emphasis mine. See:

Count the number of attributes that are NULL for a row
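A quick way to check the quoted behaviour is to aggregate over an empty table. The sketch below assumes SQLite (via Python's `sqlite3`) as a stand-in for PostgreSQL and a cut-down `exam_items` table; `sum` and `max` are picked only as examples of "these functions".

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE exam_items (id INTEGER PRIMARY KEY, used BOOLEAN)")

# No rows at all: count() yields 0, the other aggregates yield NULL (None in Python).
row = con.execute(
    "SELECT count(id), sum(id), max(id) FROM exam_items"
).fetchone()

print(row)  # (0, None, None)
```

So wrapping `count()` in `COALESCE` has nothing to replace, while `sum()` or `max()` over an empty set is where `COALESCE` can be useful.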

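`count(col)` only counts rows where `col` is not NULL. To get 0 for organisations without a match, it must be applied to a column of the right table that is defined `NOT NULL` (like `e.id`), or one that the join condition guarantees to be `NOT NULL` (`e.organisation_id`, `e.item_template_id`, or `e.used` in the example).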
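The choice of argument matters because `count(*)` also counts the NULL-extended row that the `LEFT JOIN` produces for an unmatched organisation. A small sketch, again assuming SQLite as a stand-in for PostgreSQL and hypothetical sample rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (id INTEGER PRIMARY KEY, organisation_id INTEGER);
    INSERT INTO organisations VALUES (1, 'Alpha'), (2, 'Beta');
    -- Only Alpha has items; Beta gets a single NULL-extended row from the join
    INSERT INTO exam_items (organisation_id) VALUES (1), (1);
""")

rows = con.execute("""
    SELECT o.name
         , count(*)    AS star_count  -- counts the NULL-extended row, too
         , count(e.id) AS id_count    -- ignores rows where e.id IS NULL
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    GROUP  BY o.name
    ORDER  BY o.name
""").fetchall()

print(rows)  # [('Alpha', 2, 2), ('Beta', 1, 0)]
```

For Beta, `count(*)` reports 1 although there is no exam item, which is why the query above counts `e.id`.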

Since `used` is type `boolean`, the expression `e.used = true` is noise that boils down to just `e.used`.

Since `o.name` is not defined `UNIQUE NOT NULL`, you may want to `GROUP BY o.id` instead (`id` being the PK), unless you intend to fold rows with the same name (including NULL).
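What "folding" means in practice can be sketched with two organisations that happen to share a name. This assumes SQLite as a stand-in for PostgreSQL and invented sample rows; the filter conditions are left out to keep the focus on the `GROUP BY` column.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (id INTEGER PRIMARY KEY, organisation_id INTEGER);
    -- Two distinct organisations share the name 'Alpha'
    INSERT INTO organisations VALUES (1, 'Alpha'), (2, 'Alpha'), (3, 'Beta');
    INSERT INTO exam_items (organisation_id) VALUES (1), (2), (2);
""")

# Grouping by name folds both 'Alpha' organisations into one row.
by_name = con.execute("""
    SELECT o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    GROUP  BY o.name
    ORDER  BY o.name
""").fetchall()

# Grouping by the primary key keeps one row per organisation.
by_id = con.execute("""
    SELECT o.id, o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    GROUP  BY o.id
    ORDER  BY o.name, o.id
""").fetchall()

print(by_name)  # [('Alpha', 3), ('Beta', 0)]
print(by_id)    # [(1, 'Alpha', 1), (2, 'Alpha', 2), (3, 'Beta', 0)]
```

Selecting `o.name` while grouping only by `o.id` is accepted in PostgreSQL because `id` is the primary key of `organisations`.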

Aggregate first, join later

If most or all rows of `exam_items` are counted in the process, this equivalent query is typically considerably faster / cheaper:

    SELECT o.id, o.name AS organisation_name
         , COALESCE(e.total_used, 0) AS total_used  -- 0 instead of NULL where no row matched
    FROM   organisations o
    LEFT   JOIN (
       SELECT organisation_id AS id   -- alias to simplify join syntax
            , count(*) AS total_used  -- count(*) = fastest to count all
       FROM   exam_items
       WHERE  item_template_id = #{sanitize(item_template_id)}
       AND    used
       GROUP  BY 1
       ) e USING (id)
    ORDER  BY o.name, o.id;

(This assumes that you don't want to fold rows with the same name, as mentioned above, which is the typical case.)

Now we can use the faster / simpler `count(*)` in the subquery, and we need no `GROUP BY` in the outer `SELECT`. Organisations without any matching row get no row from the subquery, so the outer `COALESCE` turns the resulting NULL into 0.

See:

Multiple array_agg() calls in a single query
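The aggregate-first variant can be tried out the same way. The sketch below assumes SQLite as a stand-in for PostgreSQL, hypothetical sample rows, and the made-up template id `7`. It selects both the raw subquery count and a `COALESCE`d one, because organisations that have no row in the subquery come back with NULL rather than 0.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (
        id INTEGER PRIMARY KEY,
        organisation_id INTEGER,
        item_template_id INTEGER,
        used BOOLEAN DEFAULT 0
    );
    INSERT INTO organisations VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    -- Alpha: two used items for template 7; Beta: one unused item; Gamma: none
    INSERT INTO exam_items (organisation_id, item_template_id, used)
    VALUES (1, 7, 1), (1, 7, 1), (2, 7, 0);
""")

rows = con.execute("""
    SELECT o.id, o.name
         , e.total_used              AS raw_count   -- NULL where the subquery has no row
         , COALESCE(e.total_used, 0) AS total_used  -- 0 instead of NULL
    FROM   organisations o
    LEFT   JOIN (
       SELECT organisation_id AS id
            , count(*) AS total_used   -- aggregate before joining
       FROM   exam_items
       WHERE  item_template_id = ? AND used
       GROUP  BY organisation_id
       ) e USING (id)
    ORDER  BY o.name, o.id
""", (7,)).fetchall()

print(rows)  # [(1, 'Alpha', 2, 2), (2, 'Beta', None, 0), (3, 'Gamma', None, 0)]
```

Here `COALESCE` is applied to a column coming out of the outer join, not directly to `count()`, so it does not contradict the note above.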

answered Nov 05 '22 20:11

Erwin Brandstetter
