Question

I have created an compound index on mysql, here is the command that i use: create index deliver_aid_sid_rcnt_idx on DELIVER_SM(AID,STATUSID,RETRY_CNT) USING BTREE; The weird is that i got different cardinality value for each field. is that normal? i have created other compound index as well and the cardinality value for each field is the same.

+--------------+-----------------+-------------+--------------------------+-------------+
| TABLE_SCHEMA | TABLE_NAME      | COLUMN_NAME | INDEX_NAME               | CARDINALITY |
+--------------+-----------------+-------------+--------------------------+-------------+
| prddb        | DELIVER_SM      | AID         | deliver_aid_sid_rcnt_idx |          28 |
| prddb        | DELIVER_SM      | STATUSID    | deliver_aid_sid_rcnt_idx |         286 |
| prddb        | DELIVER_SM      | RETRY_CNT   | deliver_aid_sid_rcnt_idx |         286 |
+--------------+-----------------+-------------+--------------------------+-------------+
 
Was it helpful?

Solution

Index cardinality is not a number of distinct values of columns, it is rather a number of nodes in the b-tree index.

Consider below example:

CREATE TABLE abc(   a int, b int, c int );

set @x = 0;
INSERT INTO abc( a, b, c ) 
SELECT (@x:=@x+1),
       round( @x / 10 ),
       round( @x / 100 )
FROM information_schema.columns
LIMIT 421;

CREATE INDEX ix1 ON abc( a, b, c );
CREATE INDEX ix2 ON abc( c, b, a );

ANALYZE TABLE abc;

and queries that show cardinality of indexes:

SELECT COUNT( distinct a) a,
       COUNT( distinct b) b,
       COUNT( distinct c) c,
       COUNT( * )
FROM abc;

SELECT table_name, index_name, column_name, cardinality
FROM INFORMATION_SCHEMA.STATISTICS
WHERE table_name = 'abc' AND index_name = 'ix1';

SELECT table_name, index_name, column_name, cardinality
FROM INFORMATION_SCHEMA.STATISTICS
WHERE table_name = 'abc' AND index_name = 'ix2';


Look at this demo to see results: http://www.sqlfiddle.com/#!2/b5987/1

The table has 421 rows.
Column a has 421 distinct values.
Column b has 43 distinct values.
Column c has 5 distinct values.

I am a poor drawer so I don't attach drawings of these b-tree indexes here :)
But I hope you can imagine a picture of a b-tree index in your head, like in this link: http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm
(BTW I recommend you to study this material, it's oracle related, not MySql, but it is great explanation how indexes work and how they are organized). enter image description here

For ix1 ON abc( a, b, c ) index MySql shows following cardinalites:
a --> 407
b --> 407
c --> 407

Remember that cardinality numbers are not exact values, but rather estimates.

Here a is a leading column in the index (this is the column with the highest number of distinct values), and because of this it creates a high number of top level nodes in the index. Remaining columns (their values) are also stored in (or maybe "under) these top-level index nodes.

Hovever for ix2 ON abc( c, b, a ) estimate values of cardinality are different:
c --> 9
b --> 101
a --> 407

In this case, where c is the leading column of the index, MySql "thinks" (estimates) that the index has 9 "top level" nodes, values of b take 101 nodes "below c", and a takes 407 nodes in the index.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top