Query performance question
Question
I am using the demo from here to reproduce my issue, with the table structure changed as shown below and the demo partition function changed to datetime.
CREATE TABLE [dbo].[DemoPartitionedTable](
[DemoID] [int] IDENTITY(1,1) NOT NULL,
[SomeData] [sysname] NOT NULL,
[CaptureDate] [datetime] NULL,
CONSTRAINT [PK_DemoPartitionedTable] UNIQUE NONCLUSTERED
(
[DemoID] ASC,
[CaptureDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
Users run a query like the one below (Query 1):
SELECT [DemoID], [SomeData], [CaptureDate] FROM
[dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and
CONVERT(varchar(10),CaptureDate,112) between 20190912 and 20190912
Plan: https://www.brentozar.com/pastetheplan/?id=r1veZhovH
It takes about 4 hours to return 5 million rows.
To improve on the above, I filtered on the partitioning column directly, as below (Query 2):
SELECT [DemoID], [SomeData], [CaptureDate] FROM
[dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and
CaptureDate>= cast('2019-09-12' as date) and
CaptureDate< cast('2019-09-13' as date)
This completes in around 40 minutes.
Plan: https://www.brentozar.com/pastetheplan/?id=BkDYl3svB
If I run the query below instead (Query 3), it completes in 10 minutes and scans only the partition for the required day, whereas the first two queries scanned all partitions:
SELECT [DemoID],
[SomeData],
[CaptureDate]
FROM [dbo].[DemoPartitionedTable]
WHERE (1=1)
AND (1=1)
AND $partition.DemoPartitionFunction(CaptureDate)>=$partition.DemoPartitionFunction('09/12/2019')
AND $partition.DemoPartitionFunction(CaptureDate)<$partition.DemoPartitionFunction('09/13/2019')
Plan: https://www.brentozar.com/pastetheplan/?id=S1j9CjsPB
The problem with Query 3 is that users have to find the partition function (PF) name before querying. Why can't Query 2 perform like Query 3 and make use of partition elimination? Is there a way to code the query so that it finds the PF name rather than hard-coding it?
Please advise.
Solution
The table in the question isn't partitioned. I assume the intended definitions are:
CREATE PARTITION FUNCTION DemoPartitionFunction (datetime)
AS RANGE RIGHT
FOR VALUES (DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -7),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -6),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -5),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -4),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -3),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -2),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -1),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 0),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 1),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 2),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 3),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 4),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 5),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 6),
DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 7));
CREATE PARTITION SCHEME DemoPartitionScheme
AS PARTITION DemoPartitionFunction
ALL TO ([DEFAULT]);
CREATE TABLE [dbo].[DemoPartitionedTable](
[DemoID] [int] IDENTITY(1,1) NOT NULL,
[SomeData] [sysname] NOT NULL,
[CaptureDate] [datetime] NULL,
CONSTRAINT [PK_DemoPartitionedTable] UNIQUE NONCLUSTERED
(
[DemoID] ASC,
[CaptureDate] ASC
)
ON DemoPartitionScheme(CaptureDate)
) ON DemoPartitionScheme(CaptureDate);
Query 1
SELECT [DemoID], [SomeData], [CaptureDate] FROM
[dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and
CONVERT(varchar(10),CaptureDate,112) between 20190912 and 20190912
Aside from the redundant 1=1 predicates, the date comparison is a mess of mismatched types. The datetime column CaptureDate is converted to varchar(10) using style 112 (yyyymmdd), then compared to integers.
The residual predicate on the heap scan reflects this confusion:
CONVERT_IMPLICIT(int,CONVERT(varchar(10),[dbo].[DemoPartitionedTable].[CaptureDate],112),0)>=(20190912)
AND CONVERT_IMPLICIT(int,CONVERT(varchar(10),[dbo].[DemoPartitionedTable].[CaptureDate],112),0)<=(20190912)
These predicates cannot use an index, nor do they help SQL Server with partition elimination.
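As a rough illustration of that conversion chain (the variable @d and its value are made up for this sketch, not taken from the question's data): the string produced by style 112 is implicitly converted to an integer, because int has a higher data type precedence than varchar.

```sql
-- Hypothetical sample value, used only to show the conversion chain
DECLARE @d datetime = '2019-09-12T10:30:00';

SELECT
    -- Step 1: datetime -> string in yyyymmdd form
    CONVERT(varchar(10), @d, 112) AS AsString,
    -- Step 2: string -> integer (written explicitly here;
    -- the original query leaves this step implicit)
    CONVERT(integer, CONVERT(varchar(10), @d, 112)) AS AsInteger;
```

Since both conversions are applied to the column value of every row, SQL Server has no way to translate the comparison into a range of CaptureDate values.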
Query 2
SELECT [DemoID], [SomeData], [CaptureDate] FROM
[dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and
CaptureDate>= cast('2019-09-12' as date) and
CaptureDate< cast('2019-09-13' as date)
This is a little better. It avoids converting the column, but for some reason chooses to compare the datetime CaptureDate column with values of the date data type. These are different data types. There isn't a warning about type conversion in the plan because SQL Server can convert the date typed literals to datetime during compilation. Nevertheless, the type inconsistency is enough to prevent partition elimination.
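If in doubt about the type an expression produces, one quick check is SQL_VARIANT_PROPERTY. The sketch below simply reports the base type of the expression used in Query 2 alongside a datetime conversion of the same day; the column aliases are arbitrary.

```sql
SELECT
    -- The expression used in Query 2: base type is date
    SQL_VARIANT_PROPERTY(CAST('2019-09-12' AS date), 'BaseType')
        AS Query2LiteralType,
    -- An explicit conversion to the column's type: base type is datetime
    SQL_VARIANT_PROPERTY(CONVERT(datetime, '20190912', 112), 'BaseType')
        AS ColumnMatchingType;
```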
Writing the query correctly
SELECT
DPT.DemoID,
DPT.SomeData,
DPT.CaptureDate
FROM dbo.DemoPartitionedTable AS DPT
WHERE
1 = 1
AND DPT.CaptureDate >= CONVERT(datetime, '20190912', 112)
AND DPT.CaptureDate < CONVERT(datetime, '20190913', 112);
The above query compares the datetime CaptureDate column with a datetime literal. Notice the use of an explicit style, so SQL Server knows the format of the string. You should generally prefer CONVERT with the correct style to CAST when working with dates and times.
The 1 = 1 predicate is there to avoid simple parameterization.
This query allows SQL Server to perform partition elimination, as shown by the Seek Predicate on the heap table:
Seek Keys[1]: Prefix: PtnId1001 = Scalar Operator((1))
There is still a residual predicate applied to each row in the partition:
[dbo].[DemoPartitionedTable].[CaptureDate] as [DPT].[CaptureDate]>='2019-09-12 00:00:00.000'
AND [dbo].[DemoPartitionedTable].[CaptureDate] as [DPT].[CaptureDate]<'2019-09-13 00:00:00.000'
This is because the table lacks a covering index with CaptureDate as the leading key. If you want the query to be as fast as possible, you should create an index like that.
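A minimal sketch of such an index follows. The index name is hypothetical, and the INCLUDE list assumes the query only ever needs the three columns shown in the question; adjust both to suit the real workload.

```sql
-- Hypothetical index name; covers the three columns the query selects
CREATE NONCLUSTERED INDEX IX_DemoPartitionedTable_CaptureDate
ON dbo.DemoPartitionedTable (CaptureDate)
INCLUDE (DemoID, SomeData)
-- Keep the index aligned with the table's partitioning
ON DemoPartitionScheme (CaptureDate);
```

With CaptureDate as the leading key, the date range can be evaluated as a seek within the qualifying partition rather than as a residual predicate on every row.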
For more details please see my article Why Doesn’t Partition Elimination Work?
Please pay careful attention to data types when writing queries.
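On the second part of the question: once the predicate types match the column, users do not need the partition function name at all. If the name is still wanted for diagnostics, a sketch along the following lines, using the standard catalog views, should return it for the table's heap or clustered index.

```sql
SELECT
    PF.name AS PartitionFunctionName,
    PS.name AS PartitionSchemeName
FROM sys.indexes AS I
JOIN sys.partition_schemes AS PS
    ON PS.data_space_id = I.data_space_id
JOIN sys.partition_functions AS PF
    ON PF.function_id = PS.function_id
WHERE
    I.[object_id] = OBJECT_ID(N'dbo.DemoPartitionedTable')
    -- index_id 0 = heap, 1 = clustered index (the base table)
    AND I.index_id IN (0, 1);
```

The query returns no rows if the table is not stored on a partition scheme, which is also a quick way to confirm whether a table is partitioned at all.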