
What is cardinality in MySQL?

Tags:

indexing

mysql

People also ask

What does cardinality in SQL mean?

In SQL, cardinality refers to the uniqueness of the data in a specific column of a table. A column with many duplicated values has low cardinality, and a column with few duplicates has high cardinality. In other words, the higher the cardinality, the less data duplication there is in that column.
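
As a rough illustration of that definition, the sketch below counts distinct values per column with `COUNT(DISTINCT ...)`. It uses Python's built-in `sqlite3` module rather than MySQL so that it runs without a server. The `users` table, its data, and the `column_cardinality` helper are hypothetical names chosen for this example.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO users (id, country) VALUES (?, ?)",
    [(1, "US"), (2, "US"), (3, "DE"), (4, "US"), (5, "FR")],
)


def column_cardinality(conn, table, column):
    """Return the exact number of distinct non-NULL values in a column."""
    # Identifiers cannot be bound as parameters; they are trusted here.
    row = conn.execute(f"SELECT COUNT(DISTINCT {column}) FROM {table}").fetchone()
    return row[0]


id_cardinality = column_cardinality(conn, "users", "id")            # every row differs
country_cardinality = column_cardinality(conn, "users", "country")  # "US" repeats
print(id_cardinality, country_cardinality)
```

Here `id` has the highest possible cardinality for five rows, while `country` has a lower one because one value is repeated.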

How is cardinality calculated in MySQL?

Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables. The higher the cardinality, the greater the chance that MySQL uses the index when doing joins.

What is the cardinality of an index?

Index statistics typically include the number of rows in the table and the number of unique values for the leading columns of an index key. The latter is known as cardinality. "Leading columns" refers to the first column, or the first and second columns, or the first, second, and third columns of an index (and so on).
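
To make "leading columns" concrete, here is a minimal sketch that counts distinct values for each leading prefix of a composite key. It is plain Python over in-memory tuples, not how any database engine computes its statistics. The `prefix_cardinalities` helper and the sample rows are assumptions made up for this example.

```python
def prefix_cardinalities(rows, key_length):
    """Count distinct values for each leading prefix of a composite key.

    Element n-1 of the result is the number of distinct combinations
    of the first n key columns.
    """
    return [len({row[:n] for row in rows}) for n in range(1, key_length + 1)]


# Hypothetical rows for an index on (last_name, first_name, birth_year).
rows = [
    ("Smith", "Anna", 1990),
    ("Smith", "Anna", 1985),
    ("Smith", "Bob", 1990),
    ("Jones", "Anna", 1990),
]

cardinalities = prefix_cardinalities(rows, 3)
print(cardinalities)
```

The cardinality can only stay the same or grow as more leading columns are included, because each extra column can only split existing groups further.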

What is cardinality with example?

Cardinality refers to the number that is obtained after counting something. Thus, the cardinality of a set is the number of elements in it. For example, the set {1, 2, 3, 4, 5} has cardinality five which is more than the cardinality of {1, 2, 3} which is three.


Max cardinality: All values are unique

Min cardinality: All values are the same

Some columns are called high-cardinality columns because they have constraints in place (like unique) prohibiting you from putting the same value in every row.

Cardinality is a property that affects the ability to cluster, sort, and search data. It is therefore an important measurement for database query planners, which use it as a heuristic when choosing the best plan.


Wikipedia summarizes cardinality in SQL as follows:

In SQL (Structured Query Language), the term cardinality refers to the uniqueness of data values contained in a particular column (attribute) of a database table. The lower the cardinality, the more duplicated elements in a column. Thus, a column with the lowest possible cardinality would have the same value for every row. SQL databases use cardinality to help determine the optimal query plan for a given query.


It is an estimate of the number of unique values in the index.

For a table with a single primary key column, the cardinality should normally be equal to the number of rows in the table.



It's basically associated with the degree of uniqueness of a column's values as per the Wikipedia article linked to by Kami.

It is important to consider because it affects indexing strategy. There is little point in indexing a low-cardinality column with only 2 possible values, as the index will not be selective enough to be used.
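
One common way to put a number on "selective enough" is the ratio of distinct values to total rows. The sketch below computes that ratio for a two-valued flag column and for a unique column. The `selectivity` helper and the sample data are hypothetical, and the ratio is only a rule-of-thumb illustration, not the formula MySQL's optimizer uses.

```python
def selectivity(values):
    """Ratio of distinct values to total rows.

    Close to 1.0 means nearly every row is distinct; close to 0 means
    each value matches a large share of the table.
    """
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical columns of a 1000-row table.
is_active = [True, False] * 500                          # only 2 distinct values
emails = [f"user{i}@example.com" for i in range(1000)]   # all values distinct

flag_selectivity = selectivity(is_active)
email_selectivity = selectivity(emails)
print(flag_selectivity, email_selectivity)
```

With only two values, a lookup on the flag still matches about half the table, so an index on it saves little work compared with scanning every row.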


The higher the cardinality, the better the differentiation of rows. Better differentiation means fewer branches have to be navigated to reach the data.

Therefore, higher cardinality values mean:

  • better performance of read-queries;
  • bigger database size;
  • worse performance of write-queries, because hidden index data is being updated.