In Redshift/Postgres, how to count rows that meet a condition?

Q: How do you COUNT rows in Redshift?

The COUNT function counts the rows defined by the expression and has three variations. COUNT ( * ) counts all the rows in the target table, whether or not they include nulls. COUNT ( expression ) computes the number of rows with non-NULL values in a specific column or expression. COUNT ( DISTINCT expression ) computes the number of distinct non-NULL values in a column or expression.

Q: How do I COUNT rows in PostgreSQL?

The basic standard SQL query to count the rows in a table is: SELECT count(*) FROM table_name; On large tables this can be rather slow, because under the MVCC model PostgreSQL has to check the visibility of every row.

Q: How do I create a COUNT query in PostgreSQL?

The PostgreSQL COUNT function counts either the rows of a table or the non-NULL values of a specific column. With an asterisk, count(*) returns the total number of rows. With an expression, count(expression) returns the number of rows for which that expression is not NULL.
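
To see the variants side by side, here is a small sketch in PostgreSQL syntax. The inline `sample` table and its four values are made up for illustration (they are not from the question), and the row-list `VALUES` form may need adapting on Redshift.

```sql
-- Hypothetical inline data: two equal grades, one NULL, one other grade.
WITH sample(grade) AS (
    VALUES (69), (69), (NULL), (100)
)
SELECT
    count(*)              AS all_rows,         -- counts every row, NULLs included
    count(grade)          AS non_null_grades,  -- skips the row where grade is NULL
    count(DISTINCT grade) AS distinct_grades   -- distinct non-NULL values only
FROM sample;
-- all_rows = 4, non_null_grades = 3, distinct_grades = 2
```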

Tags: postgresql, amazon-web-services, amazon-redshift

I'm trying to write a query that counts only the rows that meet a condition.

For example, in MySQL I would write it like this:

    SELECT
        COUNT(IF(grade < 70, 1, NULL))
    FROM
        grades
    ORDER BY
        id DESC;

However, when I attempt to do that on Redshift, it returns the following error:

ERROR: function if(boolean, integer, "unknown") does not exist

Hint: No function matches the given name and argument types. You may need to add explicit type casts.

I checked the documentation for conditional statements, and I found

NULLIF(value1, value2)

but it only compares value1 and value2 and if such values are equal, it returns null.

I couldn't find a simple IF statement, and at first glance I couldn't find a way to do what I want to do.

I tried to use the CASE expression, but I'm not getting the results I want:

    SELECT
        CASE
            WHEN grade < 70 THEN COUNT(rank)
            ELSE COUNT(rank)
        END
    FROM
        grades

This is the way I want to count things:

failed (grade < 70)
average (70 <= grade < 80)
good (80 <= grade < 90)
excellent (90 <= grade <= 100)

and this is how I expect to see the results:

+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
|   4    |    2    |  1   |     4     |
+========+=========+======+===========+

but I'm getting this:

+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
|  11    |   11    |  11  |    11     |
+========+=========+======+===========+

I hope someone can point me in the right direction!

If it helps, here's some sample data:

    CREATE TABLE grades(
        grade integer DEFAULT 0
    );

    INSERT INTO grades(grade)
    VALUES (69), (50), (55), (60), (75), (70), (87), (100), (100), (98), (94);

779

asked Jan 22 '14 16:01

ILikeTacos

1 Answer

First, the issue you're having here is that what you're saying is "If the grade is less than 70, the value of this case expression is count(rank). Otherwise, the value of this expression is count(rank)." So, in either case, you're always getting the same value.

    SELECT
        CASE
            WHEN grade < 70 THEN COUNT(rank)
            ELSE COUNT(rank)
        END
    FROM
        grades

count() only counts non-null values, so typically the pattern you'll see to accomplish what you're trying to do is this:

    SELECT
        count(CASE WHEN grade < 70 THEN 1 END) as grade_less_than_70,
        count(CASE WHEN grade >= 70 and grade < 80 THEN 1 END) as grade_between_70_and_80
    FROM
        grades

That way the case expression will only evaluate to 1 when the test expression is true and will be null otherwise. Then the count() will only count the non-null instances, i.e. when the test expression is true, which should give you what you need.
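
Applied to the four categories from the question, the same pattern might look like the sketch below. It assumes the `grades` table holds one grade per row; the column aliases are simply the category names from the question.

```sql
-- One conditional count per category. A CASE without ELSE yields NULL
-- when the test is false, and count() ignores NULLs.
SELECT
    count(CASE WHEN grade < 70 THEN 1 END)                   AS failed,
    count(CASE WHEN grade >= 70 AND grade < 80 THEN 1 END)   AS average,
    count(CASE WHEN grade >= 80 AND grade < 90 THEN 1 END)   AS good,
    count(CASE WHEN grade >= 90 AND grade <= 100 THEN 1 END) AS excellent
FROM grades;
```

With the eleven sample grades from the question, each stored as its own row, this should return 4, 2, 1 and 4, which matches the expected table.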

Edit: As a side note, notice that this is exactly the same as how you had originally written this using count(if(test, true-value, false-value)), only rewritten as count(case when test then true-value end) (null is the stand-in false-value, since no else was supplied to the case).

Edit: postgres 9.4 was released a few months after this original exchange. That version introduced aggregate filters, which can make scenarios like this look a little nicer and clearer. This answer still gets some occasional upvotes, so if you've stumbled upon it and are using a newer postgres (i.e. 9.4+) you might want to consider this equivalent version:

    SELECT
        count(*) filter (where grade < 70) as grade_less_than_70,
        count(*) filter (where grade >= 70 and grade < 80) as grade_between_70_and_80
    FROM
        grades
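
For completeness, here is the filter form extended to all four categories, as a self-contained sketch for PostgreSQL 9.4 or later. The inline `VALUES` list stands in for the `grades` table and reuses the sample grades from the question. As far as I know, Redshift does not accept the `FILTER` clause, so the `CASE` form above remains the one to use there.

```sql
-- Inline stand-in for the grades table, one grade per row.
WITH grades(grade) AS (
    VALUES (69), (50), (55), (60), (75), (70), (87), (100), (100), (98), (94)
)
SELECT
    count(*) FILTER (WHERE grade < 70)                   AS failed,
    count(*) FILTER (WHERE grade >= 70 AND grade < 80)   AS average,
    count(*) FILTER (WHERE grade >= 80 AND grade < 90)   AS good,
    count(*) FILTER (WHERE grade >= 90 AND grade <= 100) AS excellent
FROM grades;
-- failed = 4, average = 2, good = 1, excellent = 4
```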

180

answered Sep 19 '22 16:09

yieldsfalsehood
