Question

Assume that there is a table with a clustered index

create table [a_table] ([key] binary(900) unique clustered);

and some data

insert into [a_table] ([key])
select top (1000000) row_number() over (order by @@spid)
from sys.all_columns a cross join sys.all_columns b;

By inspecting the storage statistics of this table

select st.index_level, page_count = sum(st.page_count)
from sys.dm_db_index_physical_stats(
    db_id(), object_id('a_table'), NULL, NULL, 'DETAILED') st
group by rollup (st.index_level)
order by grouping_id(st.index_level), st.index_level desc;

one can see

index_level page_count
----------- ----------
8           1
7           7
6           30
5           121
4           487
3           1952
2           7812
1           31249
0           125000
NULL        166659

that the table occupies 166659 pages in total.

However, a table scan

set nocount on;
set statistics io, time on;
declare @cnt int;
select @cnt = count(1) from [a_table];
set statistics io, time off;

produces

Table 'a_table'. Scan count 5, logical reads 484367, ...
CPU time = 1757 ms,  elapsed time = 460 ms.

almost three times the number of logical reads compared to the space taken by the table. When I examined the query plan, I noticed that SQL Server used a parallel index scan. And this is where the first part of the question arises.

How is a parallel index scan performed such that it makes SQL Server do so many logical reads?

Specifying option (maxdop 1) to suppress parallelism

set nocount on;
set statistics io, time on;
declare @cnt2 int;
select @cnt2 = count(1) from [a_table] option (maxdop 1);
set statistics io, time off;

resulted in

Table 'a_table'. Scan count 1, logical reads 156257, ...
CPU time = 363 ms,  elapsed time = 367 ms.

Comparing the statistics for the parallel and non-parallel index scans in this case leads to the conclusion that it is sometimes better to avoid a parallel index scan. And this is where the second part of the question arises.

When should I worry about parallel index scans? When should they be avoided or suppressed? What are the best practices?


The above results are obtained on

Microsoft SQL Server 2014 (SP2) (KB3171021) - 12.0.5000.0 (X64)


Solution

If you add a TABLOCK or READUNCOMMITTED hint to the table, you will get an allocation-ordered scan, which will report the same number of logical reads as the number of pages in the table.

With an allocation-ordered scan, SQL Server uses allocation structures to drive the distribution of pages among threads. IAM page accesses are not counted in STATISTICS IO.
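This is easy to verify with the queries from the question. The sketch below repeats the count with a TABLOCK hint (the variable name @cnt3 is arbitrary; exact timings will of course vary on your system):

```sql
set nocount on;
set statistics io, time on;
declare @cnt3 int;
-- TABLOCK permits an allocation-ordered scan,
-- so logical reads should be close to the table's page count
select @cnt3 = count(1) from [a_table] with (tablock);
set statistics io, time off;
```

With the allocation-ordered scan, STATISTICS IO should report roughly 166659 logical reads, matching the page_count total reported by sys.dm_db_index_physical_stats.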

For a non-allocation-ordered scan, SQL Server breaks up the scan into subranges using index keys. Each internal operation to find and distribute a new page, or range of pages (based on a key range), to a worker thread requires accessing the upper levels of the b-tree. These accesses are counted by STATISTICS IO, as are the upper-level accesses made by worker threads when locating the start point of their current range. All these extra upper-level reads account for the difference you see.
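The depth of the b-tree being navigated can be confirmed from the same DMF used in the question; this is a sketch using the index_depth column, which for this table should report 9, matching index levels 0 through 8:

```sql
select distinct st.index_depth
from sys.dm_db_index_physical_stats(
    db_id(), object_id('a_table'), NULL, NULL, 'DETAILED') st;
```

With a nine-level tree, every range assignment and range relocation touches several upper-level pages, which is why the extra reads add up so quickly for a wide (900-byte) key.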

Too much weight is given to I/O statistics in my opinion. Tune your queries according to what is important to you and your workload. This is often elapsed time, but resource utilization can also be a factor. No one metric is 'best' - you should take a balanced view.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange