Why does my columnstore index not fill the entire rowgroup?
-
07-10-2020 - |
Pergunta
I'm using [vColumnstoreDensity]
to monitor the health of my columnstore indexes.
I've noticed that I've got one index that looks strange.
As far as my understanding goes, the columnstore index will fill a rowgroup before starting the next rowgroup. For best index performance, fuller row groups are better.
Having said all of this, I have one table that I'm having trouble understanding. Even after an index rebuild using the largest resource class (to give the maximum amount of memory possible to the build process) the [vColumnstoreDensity]
view still shows this index as being spread across many partially full rowgroups.
COMPRESSED_rowgroup_count
4936
COMPRESSED_rowgroup_rows
2693512978
COMPRESSED_rowgroup_rows_MIN
468
COMPRESSED_rowgroup_rows_MAX
739443
COMPRESSED_rowgroup_rows_AVG
545687
Edit:
This is a clustered columnstore index.
Solução
There is an Extended Event that fires when a columnstore group is being 'cut', I think the event is column_store_index_build_process_segment
. This event will have a 'trim' reason, and you should look for two possible trim causes:
- memory (unlikely for your case), if the index build does not have enough memory to build a segment.
- dictionary size (the likely cause), this happens if the dictionaries used to encode the data in a column segment reach the maximum size (16Mb).
Of course, to capture this event you need to set up an XE session during the index build (the linked article shows how).
You could also look at the post-build artifacts, specifically at the dictionaries, and see if the secondary dictionaries associated with the small segment are already at full size (16Mb). This would indicate the likely cause of trim to be full dictionary.