COUNT(*) vs. COUNT(1) vs. COUNT(pk): which is better? [duplicate]

Bottom Line

Use either COUNT(field) or COUNT(*), and stick with it consistently, and if your database allows COUNT(tableHere) or COUNT(tableHere.*), use that.

In short, don't use COUNT(1) for anything. It's a one-trick pony, which rarely does what you want, and in those rare cases is equivalent to count(*)

Use `count(*)` for counting

Use * for all your queries that need to count everything, even for joins, use *

SELECT boss.boss_id, COUNT(subordinate.*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

But don't use COUNT(*) for LEFT joins, as that will return 1 even if the subordinate table doesn't match anything from parent table

SELECT boss.boss_id, COUNT(*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

Don't be fooled by those advising that when using * in COUNT, it fetches entire row from your table, saying that * is slow. The * on SELECT COUNT(*) and SELECT * has no bearing to each other, they are entirely different thing, they just share a common token, i.e. *.

An alternate syntax

In fact, if it is not permitted to name a field as same as its table name, RDBMS language designer could give COUNT(tableNameHere) the same semantics as COUNT(*). Example:

For counting rows we could have this:

SELECT COUNT(emp) FROM emp

And they could make it simpler:

SELECT COUNT() FROM emp

And for LEFT JOINs, we could have this:

SELECT boss.boss_id, COUNT(subordinate)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

But they cannot do that (COUNT(tableNameHere)) since SQL standard permits naming a field with the same name as its table name:

CREATE TABLE fruit -- ORM-friendly name
(
fruit_id int NOT NULL,
fruit varchar(50), /* same name as table name, 
                and let's say, someone forgot to put NOT NULL */
shape varchar(50) NOT NULL,
color varchar(50) NOT NULL
)

Counting with null

And also, it is not a good practice to make a field nullable if its name matches the table name. Say you have values 'Banana', 'Apple', NULL, 'Pears' on fruit field. This will not count all rows, it will only yield 3, not 4

SELECT count(fruit) FROM fruit

Though some RDBMS do that sort of principle (for counting the table's rows, it accepts table name as COUNT's parameter), this will work in Postgresql (if there is no subordinate field in any of the two tables below, i.e. as long as there is no name conflict between field name and table name):

SELECT boss.boss_id, COUNT(subordinate)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

But that could cause confusion later if we will add a subordinate field in the table, as it will count the field(which could be nullable), not the table rows.

So to be on the safe side, use:

SELECT boss.boss_id, COUNT(subordinate.*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

`count(1)`: The one-trick pony

In particular to COUNT(1), it is a one-trick pony, it works well only on one table query:

SELECT COUNT(1) FROM tbl

But when you use joins, that trick won't work on multi-table queries without its semantics being confused, and in particular you cannot write:

-- count the subordinates that belongs to boss
SELECT boss.boss_id, COUNT(subordinate.1)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

So what's the meaning of COUNT(1) here?

SELECT boss.boss_id, COUNT(1)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

Is it this...?

-- counting all the subordinates only
SELECT boss.boss_id, COUNT(subordinate.boss_id)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

Or this...?

-- or is that COUNT(1) will also count 1 for boss regardless if boss has a subordinate
SELECT boss.boss_id, COUNT(*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

By careful thought, you can infer that COUNT(1) is the same as COUNT(*), regardless of type of join. But for LEFT JOINs result, we cannot mold COUNT(1) to work as: COUNT(subordinate.boss_id), COUNT(subordinate.*)

So just use either of the following:

-- count the subordinates that belongs to boss
SELECT boss.boss_id, COUNT(subordinate.boss_id)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

Works on Postgresql, it's clear that you want to count the cardinality of the set

-- count the subordinates that belongs to boss
SELECT boss.boss_id, COUNT(subordinate.*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

Another way to count the cardinality of the set, very English-like (just don't make a column with a name same as its table name) : http://www.sqlfiddle.com/#!1/98515/7

select boss.boss_name, count(subordinate)
from boss
left join subordinate on subordinate.boss_code = boss.boss_code
group by boss.boss_name

You cannot do this: http://www.sqlfiddle.com/#!1/98515/8

select boss.boss_name, count(subordinate.1)
from boss
left join subordinate on subordinate.boss_code = boss.boss_code
group by boss.boss_name

You can do this, but this produces wrong result: http://www.sqlfiddle.com/#!1/98515/9

select boss.boss_name, count(1)
from boss
left join subordinate on subordinate.boss_code = boss.boss_code
group by boss.boss_name

Two of them always produce the same answer:

COUNT(*) counts the number of rows
COUNT(1) also counts the number of rows

Assuming the pk is a primary key and that no nulls are allowed in the values, then

COUNT(pk) also counts the number of rows

However, if pk is not constrained to be not null, then it produces a different answer:

COUNT(possibly_null) counts the number of rows with non-null values in the column possibly_null.
COUNT(DISTINCT pk) also counts the number of rows (because a primary key does not allow duplicates).
COUNT(DISTINCT possibly_null_or_dup) counts the number of distinct non-null values in the column possibly_null_or_dup.
COUNT(DISTINCT possibly_duplicated) counts the number of distinct (necessarily non-null) values in the column possibly_duplicated when that has the NOT NULL clause on it.

Normally, I write COUNT(*); it is the original recommended notation for SQL. Similarly, with the EXISTS clause, I normally write WHERE EXISTS(SELECT * FROM ...) because that was the original recommend notation. There should be no benefit to the alternatives; the optimizer should see through the more obscure notations.

Asked and answered before...

Books on line says "COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )"

"1" is a non-null expression so it's the same as COUNT(*). The optimiser recognises it as trivial so gives the same plan. A PK is unique and non-null (in SQL Server at least) so COUNT(PK) = COUNT(*)

This is a similar myth to EXISTS (SELECT * ... or EXISTS (SELECT 1 ...

And see the ANSI 92 spec, section 6.5, General Rules, case 1

        a) If COUNT(*) is specified, then the result is the cardinality
          of T.

        b) Otherwise, let TX be the single-column table that is the
          result of applying the <value expression> to each row of T
          and eliminating null values. If one or more null values are
          eliminated, then a completion condition is raised: warning-
          null value eliminated in set function.

At least on Oracle they are all the same: http://www.oracledba.co.uk/tips/count_speed.htm

Related questions
                            
                                SQL RANK() versus ROW_NUMBER()
                            
                                Check if value exists in Postgres array
                            
                                What did MongoDB not being ACID compliant before v4 really mean?
                            
                                Finding duplicate rows in SQL Server
                            
                                MySQL - Rows to Columns
                            
                                MySQL, better to insert NULL or empty string?
                            
                                Best way to work with dates in Android SQLite [closed]
                            
                                How to get script of SQL Server data? [duplicate]
                            
                                How to compare dates in datetime fields in Postgresql?
                            
                                Use email address as primary key?
                            
                                PostgreSQL Crosstab Query
                            
                                List all tables in postgresql information_schema
                            
                                What are the most common SQL anti-patterns? [closed]
                            
                                Condition within JOIN or WHERE
                            
                                Find a string by searching all tables in SQL Server
                            
                                Select top 10 records for each category
                            
                                Find rows that have the same value on a column in MySQL
                            
                                Cannot simply use PostgreSQL table name ("relation does not exist")
                            
                                The ALTER TABLE statement conflicted with the FOREIGN KEY constraint
                            
                                MySQL dump by query

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

COUNT(*) vs. COUNT(1) vs. COUNT(pk): which is better? [duplicate]

Tags:

sql

select

count

People also ask

Bottom Line

Use `count(*)` for counting

An alternate syntax

Counting with null

`count(1)`: The one-trick pony

Recent Activity

Donate For Us

COUNT(*) vs. COUNT(1) vs. COUNT(pk): which is better? [duplicate]

Tags:

sql

select

count

People also ask

Bottom Line

Use count(*) for counting

An alternate syntax

Counting with null

count(1): The one-trick pony

Related questions

Recent Activity

Donate For Us

Use `count(*)` for counting

`count(1)`: The one-trick pony