Filter condition not correctly applied to Clustered Columnstore index
-
24-01-2021 - |
Question
Using the below example, the predicates are the same, however the top statement (correctly) returns 0 rows, the bottom statement returns 1 - even though the predicates do NOT match:
declare @barcode nchar(22)=N'RECB012ZUKI449M1VBJZ'
declare @tableId int = null
declare @total decimal(10, 2) = 5.17
SELECT 1
FROM
[dbo].[transaction] WITH (INDEX([IX_Transaction_TransactionID_PaymentStatus_DeviceID_DateTime_All]))
WHERE
Barcode = @barcode
AND StatusID = 1
AND TableID = @tableID
AND @total <= Total
SELECT 1
FROM
[dbo].[transaction]
WHERE
Barcode = @barcode
AND StatusID = 1
AND TableID = @tableID
AND @total <= Total
Why could this be happening?
Further info:
- The Non Clustered Index in the top statement is NOT filtered
- CheckDB returns 0 issues
- Server Version:
Microsoft SQL Azure (RTM) - 12.0.2000.8 Dec 19 2018 08:43:17 Copyright (C) 2018 Microsoft Corporation
Paste the Plan link:
https://www.brentozar.com/pastetheplan/?id=S1w_rU68E
Further Info:
Have ran dbcc checktable ([transaction]) with all_errormsgs, extended_logical_checks, data_purity
which indicates no issues.
I can reliably reproduce the issue against this table when restoring a backup of this database.
Solution
This bug does not require dropping or renaming columns.
You will also see the same behaviour for statusId = 100
which was never present in any version of the column.
Requirements
- A clustered columnstore
- Nonclustered b-tree index
- A plan that performs a lookup on the columnstore with
- Target row(s) in delta store
- A pushed non-SARG predicate
- A comparison with NULL using an equality test
Example
DROP TABLE IF EXISTS dbo.Example;
GO
CREATE TABLE dbo.Example
(
c1 integer NOT NULL,
c2 integer NULL,
INDEX CCS CLUSTERED COLUMNSTORE,
INDEX IX NONCLUSTERED (c1)
);
GO
INSERT dbo.Example
(c1, c2)
VALUES
(1, NULL);
GO
DECLARE @c2 integer = NULL;
-- Returns one row but should not
SELECT
E.*
FROM dbo.Example AS E
WITH (INDEX(IX))
WHERE
E.c2 = @c2;
Any of the following will avoid the bug:
- Moving rows out of the delta store using any method including reorganizing with the compress rowgroups option specified
- Writing the predicate to explicitly reject
= NULL
- Enabling undocumented trace flag 9130 to avoid pushing the predicate into the lookup
db<>fiddle demo.
This bug was fixed in CU15 for SQL Server 2017 (and CU7 for SQL Server 2016 SP2):
OTHER TIPS
This is a bug with SQL Server. If a column is deleted from a table with a clustered columnstore index, and then a new column is added with the same name, it appears to be using the old, deleted column for the predicate. Here is the MVCE:
This script starts off with 10000
rows with statusId
of 1
and statusId2
of 5
- then drops the statusID
column and renames statusId2
to statusId
. So at the end all rows should have a statusId
of 5.
But the following query hits the non clustered index...
select *
from example
where statusId = 1
and total <= @filter
and barcode = @barcode
and id2 = @id2
... and returns 2
rows (with the selected statusId
different from implied by the WHERE
clause)...
+-------+---------+------+-------+----------+
| id | barcode | id2 | total | statusId |
+-------+---------+------+-------+----------+
| 5 | 5 | NULL | 5.00 | 5 |
| 10005 | 5 | NULL | 5.00 | 5 |
+-------+---------+------+-------+----------+
... whereas this one accesses the columnstore and correctly returns 0
select count(*)
from example
where statusId = 1
MVCE
/*Create table with clustered columnstore and non clustered rowstore*/
CREATE TABLE example
(
id INT IDENTITY(1, 1),
barcode CHAR(22),
id2 INT,
total DECIMAL(10,2),
statusId TINYINT,
statusId2 TINYINT,
INDEX cci_example CLUSTERED COLUMNSTORE,
INDEX ix_example (barcode, total)
);
/* Insert 10000 rows all with (statusId,statusId2) = (1,5) */
INSERT example
(barcode,
id2,
total,
statusId,
statusId2)
SELECT TOP (10000) barcode = row_number() OVER (ORDER BY @@spid),
id2 = NULL,
total = row_number() OVER (ORDER BY @@spid),
statusId = 1,
statusId2 = 5
FROM sys.all_columns c1, sys.all_columns c2;
ALTER TABLE example
DROP COLUMN statusid
/* Now have 10000 rows with statusId2 = 5 */
EXEC sys.sp_rename
@objname = N'dbo.example.statusId2',
@newname = 'statusId',
@objtype = 'COLUMN';
/* Now have 10000 rows with StatusID = 5 */
INSERT example
(barcode,
id2,
total,
statusId)
SELECT TOP (10000) barcode = row_number() OVER (ORDER BY @@spid),
id2 = NULL,
total = row_number() OVER (ORDER BY @@spid),
statusId = 5
FROM sys.all_columns c1, sys.all_columns c2;
/* Now have 20000 rows with StatusID = 5 */
DECLARE @filter DECIMAL = 5,
@barcode CHAR(22) = '5',
@id2 INT = NULL;
/*This returns 2 rows from the NCI*/
SELECT *
FROM example WITH (INDEX = ix_example)
WHERE statusId = 1
AND total <= @filter
AND barcode = @barcode
AND id2 = @id2;
/*This counts 0 rows from the Columnstore*/
SELECT COUNT(*)
FROM example
WHERE statusId = 1;
I have also raised an issue on Azure feedback portal:
And for anyone else who encounters this, rebuilding the Clustered Columnstore Index fixes the problem:
alter index cci_example on example rebuild
Rebuilding the CCI only fixes any existing data. If new records are added, the issue arises again on these records; so currently the only known fix for the table is to recreate it entirely.
Based on the plans, it appears that the Columnstore index was created with SET ANSI_NULLS OFF. Tables and indexes retain the setting as it was when the index was created. You can check by creating a duplicate Columnstore index while ensuring that ANSI_NULLS is ON, then either dropping the original or disabling it.
But, unless you have discovered a SQL Server bug, this is the only way that the results could happen.