At first I want compare count(*) and count(id), which has better performance?
mysql version
5.6.21-1~dotdeb.1-log
table info
PRIMARY KEY (`id`),
KEY `is_availability` (`is_availability`,`is_del`)
ENGINE=InnoDB AUTO_INCREMENT=48993819 DEFAULT CHARSET=utf8
compare without where condition
select count(*) from op_log;
+----------+
| count(*) |
+----------+
| 48989975 |
+----------+
1 row in set (10.02 sec)
select count(id) from op_log ;
+-----------+
| count(id) |
+-----------+
| 48989990 |
+-----------+
1 row in set (12.05 sec)
count(*)
better than count(id)
compare with where condition
select count(*) from op_log where is_availability=1;
+----------+
| count(*) |
+----------+
| 48990038 |
+----------+
1 row in set (15.86 sec)
select count(id) from op_log where is_availability=1;
+-----------+
| count(id) |
+-----------+
| 48990096 |
+-----------+
1 row in set (17.13 sec)
count(*)
still better than count(id)
So if I could draw the conclusion that count(*)
has better performance than count(id)
and why is this?
From High Performance MySQL
, I got
if mysql knows some col cannot be NULL, it will optimize count(col) to count(*) internally
So I suspect that the time spend more is used to do this optimization work.
Often COUNT(*)
would have marginally better performance. COUNT(id)
needs to check if id
is not NULL
for it to work. That means that it needs to read the value (as well as the small overhead for checking NULL
ness).
I would expect the timings to be the same in most databases when id
is a clustered primary key. However, perhaps the MySQL optimizer doesn't bother avoiding the NULL
check even for columns that are declared NOT NULL
.
Note: when doing timings you have to be very careful about starting with a cold cache. In your case, the faster query seems to be the first run, so caching doesn't seem a likely explanation for the difference in performance.
COUNT(*) will count all rows of your table
COUNT(column) will count non-NULLs columns only of your table.
You can check this article:
The thing is count(*) query can use covering index even while count(col) can’t. Of course you can extend index to be (i,val) and get query to be index covered again but I would use this workaround only if you can’t change the query (ie it is third party application) or in case column name is in the query for reason, and you really need count of non-NULL values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With