Is fragmentation at non-leaf levels a problem?
25-02-2021
Question
SQL version: 2017 Standard x64
OS: Windows Server 2016 Datacenter (10.0)
I've just created some minimal indexes on ETL tables which were previously heaps with no indexes. Both read and write performance have improved. I thought I'd keep an eye on index fragmentation, so I'm looking at sys.dm_db_index_physical_stats in DETAILED mode, simply to learn more about this area. I'm an experienced DWH developer rather than a DBA, so my questions may seem very basic to DBAs (there is no DBA here, or they'd be taking care of this).
The index I'm looking at is a clustered unique index. The table has 1.3m rows. At the leaf level (0) I see 0.89% fragmentation, which is good. At higher levels, though, there is fragmentation. The index has levels 0-3. At level 2, fragmentation is 52%, fragment_count=23, record_count=2781.
This is confusing me, especially as I only created the index yesterday.
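For reference, per-level figures like the ones above can be pulled with a query along these lines. It is only a sketch: dbo.EtlTable is a hypothetical table name, and index_id 1 is assumed because the index in question is the clustered one.

```sql
-- Hypothetical table name: replace dbo.EtlTable with the table being inspected.
SELECT
    ips.index_id,
    ips.index_level,                  -- 0 = leaf; the highest value is the root
    ips.index_depth,
    ips.page_count,
    ips.record_count,
    ips.fragment_count,
    ips.avg_fragment_size_in_pages,
    ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(),                     -- current database
         OBJECT_ID(N'dbo.EtlTable'),  -- hypothetical table name
         1,                           -- index_id 1 = the clustered index
         NULL,                        -- all partitions
         'DETAILED'                   -- return a row for every level, not just the leaf
     ) AS ips
WHERE ips.alloc_unit_type_desc = N'IN_ROW_DATA'
ORDER BY ips.index_level DESC;
```

Including page_count next to the percentage helps put it in perspective: an upper level usually holds far fewer pages than the leaf, so a high percentage there can correspond to a small number of out-of-order pages.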
1. Is fragmentation at non-leaf levels a performance concern?
2. How can there be so much fragmentation at level 2 already?
3. In a clustered index, what do non-leaf (>0) levels in the B-tree actually store? It can't be row data as in level 0: is it a pointer to a set of lower index entries?
4. If I'm right about 3, it would seem a trivial job to reorganise/rebuild an upper level of a B-tree, e.g. one with only 2781 records, without going near the heavy-duty work of reorganising the leaf level. But I can't see any mention of a way to do this.
5. What does fragment_count actually mean, especially at higher levels? The documentation only says that logical fragmentation is "the percentage of out-of-order pages in the leaf pages of an index". Is fragment_count a count of contiguous areas of the index, which are in-order internally?
Solution
1. No. Non-leaf index pages are particularly likely to stay in cache. In principle, fragmentation there might slow down read-ahead a little (the upper levels are used to drive it) the first time an index scan with read-ahead occurs, but meh. Some people regard fragmentation at any level as a normal thing and not something to worry about if the data is usually already in memory or if the storage subsystem has low latency and good throughput.
2. Because SQL Server doesn't go out of its way to avoid it. Non-leaf pages can experience page splits too. Padding the index above the leaf (PAD_INDEX) is possible but rarely worthwhile.
3. The clustered index key(s) and any uniquifier, along with a pointer to the child page one level down.
4. This isn't implemented.
5. I believe it is a count of the number of contiguous areas at that level of the index, yes.
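Since there is no way to rebuild only the upper levels, the nearest available option is a full rebuild with padding turned on. A minimal sketch, assuming a hypothetical index name CIX_EtlTable on a hypothetical table dbo.EtlTable, with an illustrative fill factor of 90 rather than a recommended one:

```sql
-- Hypothetical index and table names; the fill factor value is illustrative only.
ALTER INDEX CIX_EtlTable ON dbo.EtlTable
REBUILD WITH (
    FILLFACTOR = 90,   -- leave roughly 10% free space on each leaf page
    PAD_INDEX  = ON    -- apply the same fill factor to the intermediate-level pages
);
```

A rebuild recreates every level of the index, leaf included, so it is not the lightweight upper-level-only operation asked about. ALTER INDEX ... REORGANIZE is no alternative here, as it only works on the leaf level.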