Question

I have a problem with SQL performance. For some reason the following queries are very slow.

I have two lists that contain Ids from a certain table. I need to delete every record from the first list whose Id already exists in the second list:

DECLARE @IdList1 TABLE(Id INT)
DECLARE @IdList2 TABLE(Id INT)

-- Approach 1
DELETE list1
FROM @IdList1 list1
INNER JOIN @IdList2 list2 ON list1.Id = list2.Id

-- Approach 2
DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2)

The two lists may contain more than 10,000 records. In that case each query takes more than 20 seconds to execute.

The execution plan also showed something I don't understand, which may explain why it is so slow: [image: query plan of both queries]

I filled both lists with 10,000 sequential integers, so both lists contained the values 1 to 10,000 as a starting point.

As you can see, both queries show an Actual Number of Rows of 50,005,000 for @IdList2! The figure for @IdList1 is correct (Actual Number of Rows is 10,000).

I know there are other ways to solve this, such as filling a third list instead of removing from the first list. But my question is:

Why are these delete queries so slow and why do I see these strange query plans?


Solution

Add a primary key to your table variables and watch them scream:

DECLARE @IdList1 TABLE(Id INT PRIMARY KEY NOT NULL)
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY NOT NULL)

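Because there's no index on these table variables, any join or subquery must examine on the order of 10,000 times 10,000 = 100,000,000 pairs of values. The 50,005,000 rows reported in your plan equal 1 + 2 + ... + 10,000 = 10,000 times 10,001 / 2, which is consistent with the scan of @IdList2 stopping at the first match for each row of @IdList1.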
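To try this out, here is a minimal sketch that rebuilds the setup from the question (both lists holding 1 to 10,000) with the primary keys in place. The way the numbers are generated is my own assumption, not something from the question: it relies on the cross join of sys.all_objects with itself returning at least 10,000 rows, which should hold on a typical instance.

```sql
-- Table variables keyed on Id; the PRIMARY KEY creates a unique clustered index.
DECLARE @IdList1 TABLE (Id INT NOT NULL PRIMARY KEY);
DECLARE @IdList2 TABLE (Id INT NOT NULL PRIMARY KEY);

-- Fill the first list with 1..10,000.
-- Assumption: sys.all_objects cross joined with itself has at least 10,000 rows.
INSERT INTO @IdList1 (Id)
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Copy the same values into the second list.
INSERT INTO @IdList2 (Id)
SELECT Id FROM @IdList1;

-- Approach 2 from the question, unchanged.
DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2);

-- Every Id was also in @IdList2, so nothing should be left.
SELECT COUNT(*) AS RemainingRows FROM @IdList1;
```

Run it once as shown and once with the PRIMARY KEY removed from both declarations to compare timings and the Actual Number of Rows in the plan.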

Other tips

SQL Server compiles the plan when the table variable is empty and does not recompile it when rows are added. Try

DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2)
OPTION (RECOMPILE)

This takes the actual number of rows in the table variable into account and gets rid of the nested loops plan.

Of course creating an index on Id via a constraint may well be beneficial for other queries using the table variable too.
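The same hint can be attached to the join form (Approach 1 in the question). A rough sketch, assuming the lists are filled with 1 to 10,000 as in the question; the fill query is only an illustration and relies on sys.all_objects cross joined with itself having at least 10,000 rows. Which plan the optimizer actually picks is up to it, so check the actual execution plan rather than taking the result for granted.

```sql
-- Heaps without any index, exactly as declared in the question.
DECLARE @IdList1 TABLE (Id INT);
DECLARE @IdList2 TABLE (Id INT);

-- Illustrative fill with 1..10,000 (assumed row source: sys.all_objects).
INSERT INTO @IdList1 (Id)
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

INSERT INTO @IdList2 (Id)
SELECT Id FROM @IdList1;

-- Approach 1, recompiled at execution time so the optimizer
-- sees the real row counts of both table variables.
DELETE list1
FROM @IdList1 AS list1
INNER JOIN @IdList2 AS list2 ON list1.Id = list2.Id
OPTION (RECOMPILE);
```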

Table variables can have primary keys, so if your data supports uniqueness for these Ids, you may be able to improve performance by going for

DECLARE @IdList1 TABLE(Id INT PRIMARY KEY)
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY)

Possible solutions:

1) Try creating indexes:

1.1) If the List{1|2}.Id column has unique values, you can define a unique clustered index using a PK constraint like this:

DECLARE @IdList1 TABLE(Id INT PRIMARY KEY);
DECLARE @IdList2 TABLE(Id INT PRIMARY KEY);

1.2) If the List{1|2}.Id column may contain duplicate values, you can still define a unique clustered index through a PK constraint by adding a dummy IDENTITY column like this:

DECLARE @IdList1 TABLE(Id INT, DummyID INT IDENTITY, PRIMARY KEY (Id, DummyID) );
DECLARE @IdList2 TABLE(Id INT, DummyID INT IDENTITY, PRIMARY KEY (Id, DummyID) );

2) Try adding a HASH JOIN query hint like this:

DELETE list1
FROM @IdList1 list1
INNER JOIN @IdList2 list2 ON list1.Id = list2.Id
OPTION (HASH JOIN);
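As a small illustration of option 1.2, the sketch below loads a few duplicate Ids and runs the delete from the question. The sample values are made up for the example, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Composite primary key: Id may repeat, DummyID makes each row unique.
DECLARE @IdList1 TABLE (Id INT, DummyID INT IDENTITY, PRIMARY KEY (Id, DummyID));
DECLARE @IdList2 TABLE (Id INT, DummyID INT IDENTITY, PRIMARY KEY (Id, DummyID));

-- Hypothetical sample data containing duplicates in both lists.
INSERT INTO @IdList1 (Id) VALUES (1), (1), (2), (3);
INSERT INTO @IdList2 (Id) VALUES (1), (3), (3);

-- Approach 2 from the question; removes every copy of a matching Id.
DELETE FROM @IdList1
WHERE Id IN (SELECT Id FROM @IdList2);

-- Only the row with Id = 2 should remain.
SELECT Id FROM @IdList1;
```

Because Id is the leading key column, lookups by Id can still seek on the clustered index even though Id alone is not unique.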

You are using table variables. Either add a primary key to them or change them to temporary tables and add an index. Either change should perform much better. As a rule of thumb, use table variables when the table is small; if the table is growing and contains a lot of data, use a temp table.
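A minimal sketch of the temp table route, assuming the same two Id lists as in the question. The index names and the fill query are placeholders of mine, and the fill relies on sys.all_objects cross joined with itself having at least 10,000 rows.

```sql
-- Start clean if the temp tables survived a previous run in this session.
IF OBJECT_ID('tempdb..#IdList1') IS NOT NULL DROP TABLE #IdList1;
IF OBJECT_ID('tempdb..#IdList2') IS NOT NULL DROP TABLE #IdList2;

CREATE TABLE #IdList1 (Id INT NOT NULL);
CREATE TABLE #IdList2 (Id INT NOT NULL);

-- Illustrative fill with 1..10,000 (assumed row source: sys.all_objects).
INSERT INTO #IdList1 (Id)
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

INSERT INTO #IdList2 (Id)
SELECT Id FROM #IdList1;

-- Index both lists on Id (names are hypothetical).
CREATE CLUSTERED INDEX IX_IdList1_Id ON #IdList1 (Id);
CREATE CLUSTERED INDEX IX_IdList2_Id ON #IdList2 (Id);

-- Same delete as Approach 2, now against indexed temp tables.
DELETE FROM #IdList1
WHERE Id IN (SELECT Id FROM #IdList2);

DROP TABLE #IdList1;
DROP TABLE #IdList2;
```

Unlike table variables, temp tables have column statistics, so the optimizer has real row counts to work with even without a recompile hint.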

I'd be tempted to try

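DECLARE @IdList3 TABLE(Id INT);

-- ORDER BY is not allowed on the individual operands of EXCEPT, so it is omitted.
-- Note that EXCEPT also removes duplicate Ids from the result.
INSERT @IdList3
SELECT Id FROM @IdList1
EXCEPT
SELECT Id FROM @IdList2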

No deleting required.

Try this alternate syntax:

DELETE deleteAlias
FROM @IdList1 deleteAlias
WHERE EXISTS (
        SELECT NULL
        FROM @IdList2 innerList2Alias
        WHERE innerList2Alias.Id = deleteAlias.Id
    )

EDIT:

Try using #temp tables with indexes instead.

Here is a generic example where "DepartmentKey" is the PK and the FK.

-- Drop leftovers from a previous run
IF OBJECT_ID('tempdb..#Department') IS NOT NULL
BEGIN
    DROP TABLE #Department
END

CREATE TABLE #Department
(
    DepartmentKey int,
    DepartmentName varchar(12)
)

CREATE INDEX IX_TEMPTABLE_Department_DepartmentKey ON #Department (DepartmentKey)

IF OBJECT_ID('tempdb..#Employee') IS NOT NULL
BEGIN
    DROP TABLE #Employee
END

CREATE TABLE #Employee
(
    EmployeeKey int,
    DepartmentKey int,
    SSN varchar(11)
)

CREATE INDEX IX_TEMPTABLE_Employee_DepartmentKey ON #Employee (DepartmentKey)

-- Delete every department that has at least one matching employee row
DELETE deleteAlias
FROM #Department deleteAlias
WHERE EXISTS ( SELECT NULL FROM #Employee innerE WHERE innerE.DepartmentKey = deleteAlias.DepartmentKey )

-- Clean up
IF OBJECT_ID('tempdb..#Employee') IS NOT NULL
BEGIN
    DROP TABLE #Employee
END

IF OBJECT_ID('tempdb..#Department') IS NOT NULL
BEGIN
    DROP TABLE #Department
END
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow