Not a long ago I started to optimize queries for mysql db. I created some indexes and decided to see their params, using show index from syntax. And saw that cardinality doesn't equal distinct count for some columns.
So, I have the table
CREATE TABLE `item_owners` (
`uid` varchar(255) NOT NULL,
`version` bigint(20) NOT NULL,
`type_id` varchar(255) NOT NULL,
`owner_name` varchar(255) NOT NULL,
`item_id` varchar(255) NOT NULL,
`status_id` varchar(255) NOT NULL,
PRIMARY KEY (`uid`),
KEY `FK181EADBC7346EE24` (`status_id`),
KEY `FK181EADBC90094D43` (`type_id`),
KEY `FK181EADBC499E38CA` (`item_id`),
CONSTRAINT `FK181EADBC499E38CA` FOREIGN KEY (`item_id`) REFERENCES `items` (`uid`),
CONSTRAINT `FK181EADBC7346EE24` FOREIGN KEY (`status_id`) REFERENCES `statuses` (`uid`),
CONSTRAINT `FK181EADBC90094D43` FOREIGN KEY (`type_id`) REFERENCES `types` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
And in that table I have 2M rows(and indeed 2M uniq PK), 500k uniq item_id, 5 uniq status_id and 20 uniq type_id
But when I execute
show index from item_owners
I've got
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| item_owners | 0 | PRIMARY | 1 | uid | A | 1893427 | NULL | NULL | | BTREE | | |
| item_owners | 1 | FK181EADBC7346EE24 | 1 | status_id | A | 17 | NULL | NULL | | BTREE | | |
| item_owners | 1 | FK181EADBC90094D43 | 1 | type_id | A | 17 | NULL | NULL | | BTREE | | |
| item_owners | 1 | FK181EADBC499E38CA | 1 | item_id | A | 148000 | NULL | NULL | | BTREE | | |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Then I decided to execute
analyze table item_owners
And after this output was:
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| item_owners | 0 | PRIMARY | 1 | uid | A | 2005419 | NULL | NULL | | BTREE | | |
| item_owners | 1 | FK181EADBC7346EE24 | 1 | status_id | A | 17 | NULL | NULL | | BTREE | | |
| item_owners | 1 | FK181EADBC90094D43 | 1 | type_id | A | 17 | NULL | NULL | | BTREE | | |
| item_owners | 1 | FK181EADBC499E38CA | 1 | item_id | A | 2005419 | NULL | NULL | | BTREE | | |
+-------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Anyone know how it possible? , because I really misunderstand why cardinality doesn't equal number of uniq rows even for PK index.
To count distinct values, you can use distinct in aggregate function count(). The result i3 tells that we have 3 distinct values in the table.
Index cardinality refers to the uniqueness of values stored in a specified column within an index. MySQL generates the index cardinality based on statistics stored as integers, therefore, the value may not be necessarily exact.
Cardinality is important — cardinality means the number of distinct values in a column. If you create an index in a column that has low cardinality, that's not going to be beneficial since the index should reduce search space. Low cardinality does not significantly reduce search space.
MySQL COUNT(DISTINCT) function returns a count of number rows with different non-NULL expr values. Where expr is a given expression. The following MySQL statement will count the unique 'pub_lang' and average of 'no_page' up to 2 decimal places for each group of 'cate_id'.
Quoting the manual:
Cardinality
An estimate of the number of unique values in the index. This is updated by running
ANALYZE TABLE
ormyisamchk -a
. Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With