Question

I've read about MySQL's loose index scan, and it makes sense as an optimization. However, when I work through an example B+tree, I don't understand how nodes can be skipped efficiently, because potential matches seem to force many traversals.

Take this B+-tree:

                                                                 +-------------+
                                                                 |A            |
                       +-----------------------------------------+ (5,2) (7,0) +----------------------------------------+
                       |                                         |             |                                        |
                       |                                         +------+------+                                        |
                       |                                                |                                               |
                       |                                                |                                               |
                +------v------+                                  +------v------+                                 +------v------+
                |B            +---------------------------------->C            +--------------------------------->D            |
       +--------+ (2,0) (2,2) +----------+             +---------+ (5,2) (5,4) +---------+              +--------+ (7,0) (8,1) +---------+
       |        |             |          |             |         |             |         |              |        |             |         |
       |        +------+------+          |             |         +------+------+         |              |        +------+------+         |
       |               |                 |             |                |                |              |               |                |
       |               |                 |             |                |                |              |               |                |
+------v------+ +------v------+  +-------v-----+ +-----v-------+ +------v------+ +-------v-----+ +------v------+ +------v------+ +-------v-----+
|E            | |F            |  |G            | |H            | |I            | |L            | |M            | |N            | |O            |
| (0,0) (1,0) +-> (2,0) (2,1) +--> (2,2) (3,0) +-> (4,0) (5,1) +-> (5,2) (5,3) +-> (5,4) (5,5) +-> (6,0) (6,1) +-> (7,0) (8,0) +-> (8,1) (8,2) |
|             | |             |  |             | |             | |             | |             | |             | |             | |             |
+-------------+ +-------------+  +-------------+ +-------------+ +-------------+ +-------------+ +-------------+ +-------------+ +-------------+

and the query SELECT c1, MIN(c2) ... GROUP BY c1.

Now, based on my understanding, a loose index scan will skip nodes when it is certain that a subtree contains no values that would affect the (current) aggregate result.

With the tree above, I reckon the traversal will be:

  • A
  • B
  • E
    • find (0,0) (1,0)
  • backtrack
    • find (2,0)
  • skip F
  • G
    • find (3,0)
  • backtrack
  • H
    • find (4,0) (5,1)
  • backtrack
  • skip I
  • L
  • backtrack
  • D
  • M
    • find (6,0)
  • backtrack
    • find (7,0)
  • N
    • find (8,0)
  • O

Assuming that backtracking has no cost, isn't it less expensive in this example to perform a tight index scan (that is, to walk all the leaves directly)?

Is there any mistake in the traversal logic above? Or is this an excessively pessimistic (and therefore non-representative) example?
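To make the comparison concrete, here is a toy Python count of the leaf pages each strategy touches. The page layout is copied from the diagram above; the "one seek per distinct c1" model of a loose scan is my own assumption, not MySQL's actual algorithm:

```python
# Toy model: leaf pages of a composite index on (c1, c2), two entries per page,
# laid out exactly as in the diagram (pages E..O, left to right).
leaves = [
    [(0, 0), (1, 0)], [(2, 0), (2, 1)], [(2, 2), (3, 0)], [(4, 0), (5, 1)],
    [(5, 2), (5, 3)], [(5, 4), (5, 5)], [(6, 0), (6, 1)], [(7, 0), (8, 0)],
    [(8, 1), (8, 2)],
]

def tight_scan(leaves):
    """A tight scan reads every leaf page, left to right."""
    return len(leaves)

def loose_scan(leaves):
    """For each distinct c1, seek to its first (c1, MIN(c2)) entry and
    count how many distinct leaf pages those seeks land on."""
    entries = [(c1, c2, page) for page, leaf in enumerate(leaves)
               for (c1, c2) in leaf]
    touched = set()
    for c1 in sorted({e[0] for e in entries}):
        # First entry of the group in index order carries MIN(c2).
        first = next(e for e in entries if e[0] == c1)
        touched.add(first[2])
    return len(touched)

print(tight_scan(leaves))   # 9 leaf pages
print(loose_scan(leaves))   # 6 leaf pages (I, L and O are skipped)
```

So in this example a loose scan saves only 3 of 9 leaf reads, which matches my suspicion that the savings here are modest.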


Solution

I think the Optimizer works something like this:

  • It can find the first row: MIN(name) WHERE foo=123
  • It can find the last row: MAX(name) WHERE foo=123

Those are done via a drill-down in the BTree, and assumed to be reasonably cheap. If there are a lot of rows for foo=123, then this is likely to skip over some blocks. Note: fetching blocks potentially means a disk hit, and disk hits are the most costly primitive operation. Note also that much of the Optimizer was designed in the days of small RAM and before SSDs.
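One way to picture the drill-down: treat the index as an ordered set of (c1, c2) pairs, and each MIN or MAX is a single seek to one end of the group. This sketch uses Python's bisect over a sorted list as a stand-in for the actual BTree descent (the data is the example from the question; the list-based model is my simplification):

```python
import bisect

# Sorted composite index entries (c1, c2), as in the question's diagram.
index = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 0), (4, 0), (5, 1),
         (5, 2), (5, 3), (5, 4), (5, 5), (6, 0), (6, 1), (7, 0), (8, 0),
         (8, 1), (8, 2)]

def group_min(index, c1):
    """MIN(c2) WHERE c1 = <c1>: seek to the first entry >= (c1,)."""
    i = bisect.bisect_left(index, (c1,))  # (c1,) sorts before (c1, anything)
    if i < len(index) and index[i][0] == c1:
        return index[i][1]
    return None

def group_max(index, c1):
    """MAX(c2) WHERE c1 = <c1>: seek just past the group, step back one."""
    i = bisect.bisect_left(index, (c1 + 1,)) - 1
    if i >= 0 and index[i][0] == c1:
        return index[i][1]
    return None

print(group_min(index, 5), group_max(index, 5))  # 1 5
```

The point is that each of these seeks costs O(log n) node visits, independent of how many rows sit in the middle of the group.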

Once it gets to the first (or last) row, then the B+Tree shines in that it can walk forward (or backward) efficiently. (But this is not relevant to your Question.)

If, as you imply in your question, there are not many rows where foo=123, then there is not much savings, perhaps none at all.

(What I don't know here is whether and when the Optimizer decides that "loose scan" is likely to be inefficient. I do know that the Optimizer will deliberately choose to do a table scan instead of using an INDEX when it estimates that the back and forth between the index BTree and the data BTree is too costly.)

A Rule of Thumb: There are 100 items in each node of the BTree. (This of course really varies between 1 and several hundred, but 100 is a handy number for estimating things.) So, a 3-level BTree like you diagrammed may be holding 1 million items and have 10,000 leaf nodes. Now, depending on the specifics of the schema and the query, a table scan would cost 10,000 disk hits versus only a few disk hits for a loose scan. (Yeah, the table is likely to be completely cached in the buffer_pool; I do know that the Optimizer does not yet take that into account in its 'cost model'.)

See also: https://dev.mysql.com/doc/refman/8.0/en/cost-model.html

OTHER TIPS

Loose index scan works because the first record in a group satisfies the query, so there is no need to read the other records in the group.
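In other words, on an index ordered by (c1, c2), the first entry of each c1 group already is the MIN(c2) row. A quick sketch with itertools.groupby, using the example data from the question:

```python
from itertools import groupby

# Index entries ordered by (c1, c2), as in the question's diagram.
index = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 0), (4, 0), (5, 1),
         (5, 2), (5, 3), (5, 4), (5, 5), (6, 0), (6, 1), (7, 0), (8, 0),
         (8, 1), (8, 2)]

# SELECT c1, MIN(c2) ... GROUP BY c1: take just the first entry per group.
result = [next(g) for _, g in groupby(index, key=lambda e: e[0])]
print(result[:4])  # [(0, 0), (1, 0), (2, 0), (3, 0)]
```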

If it helps, there are more details on file formats and how they impact queries in my slides: https://www.slideshare.net/akuzminsky/indexes-in-my-sql-aleksandr-kuzminsky-https-twindbcom

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange