Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why cardinality value in mysql indexes don't equal distinct count for column values

Not a long ago I started to optimize queries for mysql db. I created some indexes and decided to see their params, using show index from syntax. And saw that cardinality doesn't equal distinct count for some columns.

So, I have the table

CREATE TABLE `item_owners` (
  `uid` varchar(255) NOT NULL,
  `version` bigint(20) NOT NULL,
  `type_id` varchar(255) NOT NULL,
  `owner_name` varchar(255) NOT NULL,
  `item_id` varchar(255) NOT NULL,
  `status_id` varchar(255) NOT NULL,
  PRIMARY KEY (`uid`),
  KEY `FK181EADBC7346EE24` (`status_id`),
  KEY `FK181EADBC90094D43` (`type_id`),
  KEY `FK181EADBC499E38CA` (`item_id`),
  CONSTRAINT `FK181EADBC499E38CA` FOREIGN KEY (`item_id`) REFERENCES `items` (`uid`),
  CONSTRAINT `FK181EADBC7346EE24` FOREIGN KEY (`status_id`) REFERENCES `statuses` (`uid`),
  CONSTRAINT `FK181EADBC90094D43` FOREIGN KEY (`type_id`) REFERENCES `types` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

And in that table I have 2M rows(and indeed 2M uniq PK), 500k uniq item_id, 5 uniq status_id and 20 uniq type_id

But when I execute

show index from item_owners

I've got

+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| item_owners |          0 | PRIMARY            |            1 | uid         | A         |     1893427 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC7346EE24 |            1 | status_id   | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC90094D43 |            1 | type_id     | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC499E38CA |            1 | item_id     | A         |      148000 |     NULL | NULL   |      | BTREE      |         |               |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Then I decided to execute

analyze table item_owners

And after this output was:

+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| item_owners |          0 | PRIMARY            |            1 | uid         | A         |     2005419 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC7346EE24 |            1 | status_id   | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC90094D43 |            1 | type_id     | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| item_owners |          1 | FK181EADBC499E38CA |            1 | item_id     | A         |     2005419 |     NULL | NULL   |      | BTREE      |         |               |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Anyone know how it possible? , because I really misunderstand why cardinality doesn't equal number of uniq rows even for PK index.

like image 616
Andrej Soroj Avatar asked May 24 '13 10:05

Andrej Soroj


People also ask

How count distinct values in MySQL column?

To count distinct values, you can use distinct in aggregate function count(). The result i3 tells that we have 3 distinct values in the table.

What is cardinality in indexes MySQL?

Index cardinality refers to the uniqueness of values stored in a specified column within an index. MySQL generates the index cardinality based on statistics stored as integers, therefore, the value may not be necessarily exact.

Is database index useful for low cardinality?

Cardinality is important — cardinality means the number of distinct values in a column. If you create an index in a column that has low cardinality, that's not going to be beneficial since the index should reduce search space. Low cardinality does not significantly reduce search space.

What should be the function to count only unique values in MySQL?

MySQL COUNT(DISTINCT) function returns a count of number rows with different non-NULL expr values. Where expr is a given expression. The following MySQL statement will count the unique 'pub_lang' and average of 'no_page' up to 2 decimal places for each group of 'cate_id'.


1 Answers

Quoting the manual:

Cardinality

An estimate of the number of unique values in the index. This is updated by running ANALYZE TABLE or myisamchk -a. Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables.

like image 108
RandomSeed Avatar answered Sep 28 '22 19:09

RandomSeed