Which is the most efficient way to run a particular select query?
-
16-10-2019 - |
Question
If I run the following code:
select PolicyNumber, MAX(decpageid) as decpageid, Risk
from StatRiskDecpages
where PolicyNumber = 'AR-0000301132-04'
group by PolicyNumber, Risk
I get the following results:
PolicyNumber decpageid Risk
AR-0000301132-04 41 1
AR-0000301132-04 41 2
AR-0000301132-04 37 3
All I really want to retrieve though is the policynumber and the maximum decpageid (which in this case would be 41) along with the Risk numbers (Which should be 1 and 2)
The query is also returning decpageid 37 even though it is not the maximum decpageid for the policynumber because it has a different risk.
The results I would like returned are:
PolicyNumber DecpageID Risk
AR-0000301132-04 41 1
AR-0000301132-04 41 2
I have figured out 2 different queries I can use to return my desired results but I don't think they are the most efficient. The queries I came up with are:
select PolicyNumber, MAX(decpageid) as decpageid, Risk
from StatRiskDecpages
where
PolicyNumber = 'AR-0000301132-04'
and decpageid = (
select MAX(decpageid)
from StatRiskDecpages
where PolicyNumber = 'AR-0000301132-04'
)
;
This returns the desired results but I don't want to have to specify the policy number more then once in the query. Is there a way to call the policynumber into the sub query from the outer query?
The other query I came up with was:
select t1.PolicyNumber,t2.DecpageID, t2.Risk
from (
select PolicyNumber, MAX(decpageid) as decpageid
from StatRiskDecpages
where PolicyNumber = 'AR-0000301132-04'
group by PolicyNumber
) as t1
left join StatRiskDecpages as t2
on t1.PolicyNumber = t2.PolicyNumber
and t1.decpageid = t2.DecpageID
;
I like this query because I only have to specify the policynumber 1 time and I can also expand the query so I can return the info for multiple policynumbers.
What I need to know is if this is the most efficient way of writing the query? It seems a little redundant. I might be wrong but I think there might be a better more efficient way of writing the query. Any suggestions?
Solution
I'd probably use
SELECT TOP (1) WITH TIES PolicyNumber,
decpageid,
Risk
FROM StatRiskDecpages
WHERE PolicyNumber = 'AR-0000301132-04'
ORDER BY decpageid DESC
Assuming the covering index on (PolicyNumber, decpageid) INCLUDE(Risk)
this will give you a plan like
OTHER TIPS
Here's how I refactored your query. (first I'll post your query with your execution plan, and then follows is my new query and the corresponding query plan)
Your Query
Select
t1.PolicyNumber,
t2.DecpageID,
t2.Risk
from
(
select
PolicyNumber,
MAX(decpageid) as decpageid
from StatRiskDecpages
where PolicyNumber = 'AR-0000301132-04'
group by PolicyNumber
) as t1
left join StatRiskDecpages as t2
on t1.PolicyNumber = t2.PolicyNumber
and t1.decpageid = t2.DecpageID
The execution plan for your query is as follows:
And then I wrote the query like this, giving the same desired results:
My Query
select
p.PolicyNumber,
p.decpageid,
p.Risk
from StatRiskDecpages p
where decpageid in
(
select max(decpageid)
from StatRiskDecpages
where PolicyNumber = p.PolicyNumber
)
This shows an execution plan of this:
Notice the single Table Scan (as opposed to your two Table Scans), as well as the lack of the Nested Loops for the JOIN.
As for what looks better and more maintainable, I think they are both relatively workable. Just some ideas and graphical representation of an alternative.
@ypercube makes a good point that an elegant looking query does not necessarily mean it will be more efficient. I have no idea about your indexes or data size but you may want to compare the following window function version against your existing queries (I would check both duration and the quality of the plans):
;WITH x AS
(
SELECT PolicyNumber, decpageid, Risk,
dr = DENSE_RANK() OVER (ORDER BY decpageid DESC)
FROM dbo.StatRiskDecpages
WHERE PolicyNumber = 'AR-0000301132-04'
)
SELECT PolicyNumber, decpageid, Risk
FROM x
WHERE dr = 1;
You may also consider leaving the PolicyNumber out of the result set. You already know what it is, since you passed it in, and it's the same redundant data on every row.