How to group data correctly
質問
My ultimate aim is to have a table that looks like this:
Maturity Band AAA AA A A- BBB+ BBB- BB+ BB Total
Less Than 1 yr 2.63% 5% 2% 9.63%
1 to 5 yrs 5% 5% 10%
5 to 10 yrs 5.00% 5% 10%
10 to 20 yrs 2% 2%
More than 20 yrs 10% 6% 1% 17%
Total 17.63% 5% 5% 2% 7% 5% 6% 1% 48.63%
The table in my procedure (@Worktable) i've created looks like this:
PortfolioID IssueName SandPRating SandPRatingSort MaturityBand MaturitySort
XXXXX Bond1 AAA 1 Less than 1 yr 1
XXXXX Bond2 AAA 1 Less than 1 yr 1
XXXXX Bond3 AA- 7 5 to 10 yrs 3
XXXXX Bond4 BBB+ 8 1 to 5 yrs 2
etc.......
The SandPRatingSort orders the ratings with 1 being the highest and it is the same for Maturity sort.
My issue is that i'm stuggling to code the table into the format above in my procedure (sorry if this seems easy but i'm relatively new to this). I can group it by MaturityBand but how do i get it into the correct order and also how do i achieve the percentages with the ratings as the headers? The percentages by the way are the number of bonds that have the rating as a percentage of all of the bonds the portfolio holds.
The best i've got so far is this pivot:
SELECT MaturityBand, [AAA],[AA+],[AA],[AA-],[A+],[A],[A-],[BBB+],[BBB],[BBB-],[BB+],[BB],[BB-],[B+],[B],[B-],[CCC+],[CCC],[CCC-],[CC],[C],[DDD],[DD],[D],[N.R.],[N.A.],[WR]
FROM
(
SELECT
MaturityBand
,SandPRating
FROM @Worktable
WHERE SandPRating IS NOT NULL
GROUP BY MaturityBand, SandPRating, MaturitySort, SandPSort
) AS source
PIVOT
(
COUNT(SandPRating)
FOR SandPRating IN ([AAA],[AA+],[AA],[AA-],[A+],[A],[A-],[BBB+],[BBB],[BBB-],[BB+],[BB],[BB-],[B+],[B],[B-],[CCC+],[CCC],[CCC-],[CC],[C],[DDD],[DD],[D],[N.R.],[N.A.],[WR])
) AS pvt
The pivot is not doing exactly what i'm wanting to do. How do i get the percentages? And how can i get grand totals for both the columns and the rows? Also the count in the pivot is only returning 1's, how can i get this to sum the number of ratings for each column without having to do a whole load of re work to the worktable?
A point in the right direction or some guidance would be greatly appreaciated.
Thanks
解決
With a bit of trial and error and some help from other posts i've managed to answer my own question. CTE is very useful and i'm glad i've had this experience to be able to learn the ins and outs of it.
;WITH CTE
AS
(
SELECT PortfolioID
, MaturityBand
, SandPRating
, MaturitySort
, SUM((1/RecNo)*100) AS Pct
FROM @Worktable AS A
--WHERE SandPRating IS NOT NULL
Group by MaturitySort, MaturityBand, SandPRating, PortfolioID
UNION All
SELECT PortfolioID
, MaturityBand
, 'SandPRating_Total' AS SandPRating
, MaturitySort
, COUNT(*) * 100.0
/
(
SELECT COUNT(*)
FROM @Worktable AS B
WHERE B.PortfolioID = A.PortfolioID
) AS Total_Pct
FROM @Worktable AS A
--WHERE SandPRating IS NOT NULL
GROUP BY MaturitySort, MaturityBand, PortfolioID
)
, CTE2
AS
(
SELECT Grouping_ID(SandPRating, MaturityBand, MaturitySort) AS ID
, CASE
WHEN Grouping_ID(SandPRating, MaturityBand, MaturitySort) = 3 THEN 'Total'
ELSE MaturityBand
END AS MaturityBand
, SandPRating
, CASE
WHEN Grouping_ID(SandPRating, MaturityBand, MaturitySort) = 3 THEN 1000
ELSE MaturitySort
END AS MaturitySort
, SUM(Pct) AS PCT
FROM CTE
GROUP BY ROLLUP (SandPRating
, MaturityBand
, MaturitySort)
)
--PIVOT
SELECT MaturityBand, [AAA],[AA+],[AA],[AA-],[A+],[A],[A-]
,[BBB+],[BBB],[BBB-],[BB+],[BB],[BB-],[B+],[B],[B-]
,[CCC+],[CCC],[CCC-],[CC],[C],[DDD],[DD],[D],[N.R.]
,[N.A.],[WR],[Unass],[SandPRating_Total]
FROM ( SELECT SandPRating, MaturityBand, MaturitySort, PCT
FROM Cte2
WHERE ID = 0 or ID = 3
) AS x
PIVOT (SUM(PCT)
FOR SandPRating
IN ([AAA],[AA+],[AA],[AA-],[A+],[A],[A-]
,[BBB+],[BBB],[BBB-],[BB+],[BB],[BB-],[B+],[B],[B-]
,[CCC+],[CCC],[CCC-],[CC],[C],[DDD],[DD],[D],[N.R.]
,[N.A.],[WR],[Unass],[SandPRating_Total])
) myPiv
ORDER BY MaturitySort
他のヒント
The following gets all the totals and calculates percent values for all items:
SELECT *
FROM (
SELECT
MaturitySort = COALESCE(MaturitySort, 2147483647),
MaturityBand = COALESCE(MaturityBand, 'Total'),
SandPRating = COALESCE(SandPRating , 'Total'),
[Percent] =
COUNT(*) * 100.0 /
SUM(CASE WHEN MaturityBand IS NOT NULL AND SandPRating IS NOT NULL THEN COUNT(*) END) OVER ()
FROM WorkTable
GROUP BY
CUBE(
(MaturityBand, MaturitySort),
(SandPRating)
)
) s
PIVOT (
MAX([Percent]) FOR SandPRating IN (
[AAA],[AA+],[AA],[AA-],[A+],[A],[A-],
[BBB+],[BBB],[BBB-],[BB+],[BB],[BB-],[B+],[B],[B-],
[CCC+],[CCC],[CCC-],[CC],[C],[DDD],[DD],[D],[N.R.],
[N.A.],[WR],[Unass],[Total]
)
) p
ORDER BY MaturitySort
As you can see, calculation of the percent values uses an aggregate windowing function (SUM()
). If there was no grouping by CUBE()
, we could just do SUM(COUNT(*)) OVER ()
. But with CUBE()
it would include values that shouldn't be included (the rolled up counts). And one way to account for them is testing SandPRating
and MaturityBand
for NULLs, because NULL is substituted for either or both of the columns when the row contains a rolled up count.
That assumes, of course, that neither of the columns, MaturityBand
in particular, can hold actual NULLs. More proper way to omit roll-ups would perhaps be testing the results of GROUPING(column)
instead: if it returns 0
for both columns, include the COUNT(*)
in the total:
…
[Percent] =
COUNT(*) * 100.0 /
SUM(CASE WHEN GROUPING(MaturityBand) = 0 AND GROUPING(SandPRating) = 0 THEN COUNT(*) END) OVER ()
…