Can this SQL statement be Optimized?
-
03-06-2021 - |
質問
This seems to be taking a very long time on large data sets. Will combining the first and last 3 queries into 1 make it faster? Does anyone have any input on what might make it faster? I appreciate it.
update "detail" set bal = (units * amount) where code_type = 'AB';
update "detail" set bal = (units * amount) where code_type = 'CD';
update "detail" set bal = (units * amount) where code_type = 'EF';
update "detail" set bal = (units * amount * -1) where code_type = 'GH';
update "detail" set bal = (units * amount * -1) where code_type = 'IK';
update "detail" set bal = (units * amount * -1) where code_type = 'LM';
update "detail" set bal = 0 where code_type = 'NO';
Additionally -
update bill set pay =
(select round(sum(bd1.bal),2) from "detail" bd1 where
bd1.inv = bill.inv and
(bd1.code_type = 'AB' or bd1.code_type = 'CD'));
update bill set pay = 0 where pay is null;
update bill set cost =
(select round(sum(bd2.bal),2) from "detail" bd2 where
bd2.inv = bill.inv and
(not (bd2.code_type = 'AB' or bd2.code_type = 'CD')));
update bill set cost = 0 where cost is null;
update bill set balance = round(cost + pay,2);
Thanks
解決
Performance probably stinks because you are updating the entire table, and you are updating it twelve times. If the table is seriously big, that's going to take time. Also, those two embedded subqueries will each get run once per row. Ouch.
The following frankenquery rolls everything into a single statement. It still has to hit the entire table, but at least it only does it once. I can't check syntax or test it against data, but this or something very much like it should work.
EDITED, split this into two updates (thus, two table scans requried)
UPDATE Detail
set
bal = case
when code_type in ('AB','CD','EF') then bi.units * bi.amount
when code_type in ('gh','ik','lm') then -bi.units * bi.amount
when code_type = 'NO' then 0
else bal -- If none of the above, change nothing
end
and
UPDATE Bill
set
payments = isnull(bd1.amount, payments) -- This changes nothing if nothing was calculated
,pay = case
when pay is null then 0
else pay
end
-- Ok, problem with cost: what if calculated amount is 0 and current value is non-zero?
-- I've insufficient data here to correctly resolve all the possible permutations
,cost = case
when bd2.amount is not null then cost
when cost is null then 0
else cost
end
,balance = round(charges + isnull(bd1.amount, bi.payments), 2)
from Bill bi
-- These subqueries could be combined into one using similar CASE logic,
-- and it would probably perform better (one table scan vs. two). I'm
-- leaving it this way for demonstration, and to keep the overall query
-- a bit simpler.
left outer join (select
inv
,round(sum(bd1.bal), 2) amount
from detail
where code_type = 'AB'
or code_type = 'CD'
group by inv) bd1
on bd1.inv = bi.inv -- ADDED in second edit
left outer join (select
inv -- RENAMED in second edit
,round(sum(bd2.bal), 2) amount
from detail
where code_type <> 'AB'
and code_type <> 'CD'
group by inv) bd2 -- RENAMED in second edit
on bd2.invoive = bi.inv -- ADDED in second edit
Moral: the CASE
statement can be the SQL developer's best friend.
他のヒント
1st query could be written like this:
UPDATE "detail"
SET bal =
CASE
WHEN code_type = 'NO'
THEN 0
ELSE
CASE
WHEN code_type IN ('AB',
'CD',
'EF')THEN 1
ELSE -1
END
END * (units * amount)
WHERE code_type IN ('AB','CD','EF','GH','IK','KL','NO');
Having an index on code_type would quickly filter out the qualifying rows you'll need to update. Depending on the rest of the table structure, your speed might vary, but i think this would be the fastest version. since the only thing that matters here is the 0, 1 or -1 based on code type, that's what you check against, then multiply with the units*amount.
Update: Your 2nd batch of updates can also be written in one batch:
UPDATE b
SET payments = COALESCE(x.bal,0),
cost = COALESCE(y.bal,0),
balance = ROUND(charges + COALESCE(x.bal,0),2)
FROM bill b
LEFT OUTER JOIN
( SELECT ROUND(SUM(bd1.bal),2) AS bal,
inv
FROM "detail" bd1
WHERE bd1.code_type IN ('AB',
'CD')
GROUP BY bd1.inv
)
x
ON x.inv = b.inv
LEFT OUTER JOIN
( SELECT ROUND(SUM(bd2.bal),2) AS bal,
invoice
FROM "detail" bd2
WHERE bd2.code_type NOT IN ('AB',
'CD')
GROUP BY bd2.invoice
)
y
ON y.invoice = b.invoice
Tips for speed improvements:
- Index on detail table, code_type and inv column, include bal in the index
- Index on detail table, code_type and invoice column, include bal in the index.
- Index on bill table, inv and invoice.
I think first 3 statements you can do it in single statement like this:
update detail set bal = (units * amount) where code_type in( 'AB','CD' )
Same you can do for next 3 statements also.
Currently you are recalculating units * amount
and units * amount * -1
for each query. Calculating units * amount
once and combining the first few queries should improve performance, though I don't know by how much:
declare @total int
set @total = units * amount
update "detail" set bal = @total where code_type in ('AB', 'CD', 'EF');
update "detail" set bal = (@total * -1) where code_type in ('GH', 'IK','LM');
update "detail" set bal = 0 where code_type = 'NO';