Question

I would like to add the median to the following query. The median should be calculated for each combination of BedroomsAvailable and UnitTypeId (both are already in the GROUP BY). I tried working with PERCENTILE_CONT but I couldn't figure out how to make it work.

CREATE TABLE #TempListings
(
    ListingId int,
    Price money,
    UnitTypeId int,
    BedroomsAvailable int
)
INSERT INTO #TempListings VALUES(1, 1000, 1, 1)
INSERT INTO #TempListings VALUES(2, 2000, 1, 1)
INSERT INTO #TempListings VALUES(3, 3000, 1, 1)

INSERT INTO #TempListings VALUES(4, 1000, 1, 2)
INSERT INTO #TempListings VALUES(5, 2000, 1, 2)
INSERT INTO #TempListings VALUES(6, 3000, 1, 2)

INSERT INTO #TempListings VALUES(7, 1000, 2, 1)
INSERT INTO #TempListings VALUES(8, 2000, 2, 1)
INSERT INTO #TempListings VALUES(9, 3000, 2, 1)

INSERT INTO #TempListings VALUES(10, 1000, 2, 2)
INSERT INTO #TempListings VALUES(11, 2000, 2, 2)
INSERT INTO #TempListings VALUES(12, 3000, 2, 2)
  
SELECT BedroomsAvailable, 
    COUNT(listingid) AS Count, 
    MIN(price) AS MinPrice, 
    MAX(price) AS MaxPrice, 
    AVG(price) AS AveragePrice,
    STDEV(price) as StandardDeviation,
    UnitTypeId
FROM #TempListings
GROUP BY BedroomsAvailable, UnitTypeId
ORDER BY UnitTypeId

DROP TABLE #TempListings

I am using SQL Server 2019, if it matters.

Solution

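You only asked how the syntax works, but here it is as a complete query. The sample data is extended with six extra rows so that some groups have an even number of prices.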

CREATE TABLE #TempListings
(
    ListingId int,
    Price money,
    UnitTypeId int,
    BedroomsAvailable int
)
INSERT INTO #TempListings VALUES(1, 1000, 1, 1)
INSERT INTO #TempListings VALUES(2, 2000, 1, 1)
INSERT INTO #TempListings VALUES(3, 3000, 1, 1)

INSERT INTO #TempListings VALUES(4, 1000, 1, 2)
INSERT INTO #TempListings VALUES(5, 2000, 1, 2)
INSERT INTO #TempListings VALUES(6, 3000, 1, 2)

INSERT INTO #TempListings VALUES(7, 1000, 2, 1)
INSERT INTO #TempListings VALUES(8, 2000, 2, 1)
INSERT INTO #TempListings VALUES(9, 3000, 2, 1)

INSERT INTO #TempListings VALUES(10, 1000, 2, 2)
INSERT INTO #TempListings VALUES(11, 2000, 2, 2)
INSERT INTO #TempListings VALUES(12, 3000, 2, 2)

INSERT INTO #TempListings VALUES(13, 5000, 2, 1)
INSERT INTO #TempListings VALUES(14, 6000, 2, 1)
INSERT INTO #TempListings VALUES(15, 7000, 2, 1)

INSERT INTO #TempListings VALUES(16, 8000, 2, 2)
INSERT INTO #TempListings VALUES(17, 9000, 2, 2)
INSERT INTO #TempListings VALUES(18, 10000, 2, 2)
GO
WITH #myselect
 AS (SELECT 
         BedroomsAvailable
         ,ListingId 
         ,UnitTypeId
         ,Price
         -- Median per (BedroomsAvailable, UnitTypeId), matching the outer GROUP BY
         ,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Price)  
          OVER (PARTITION BY BedroomsAvailable, UnitTypeId) AS MedianCont
    FROM #TempListings)
SELECT BedroomsAvailable 
    ,COUNT(listingid) AS Count 
    ,MIN(price) AS MinPrice 
    ,MAX(price) AS MaxPrice 
    ,AVG(price) AS AveragePrice
    ,STDEV(price) as StandardDeviation 
    , MIN(MedianCont) MedianCont
    ,UnitTypeId
FROM #myselect
GROUP BY BedroomsAvailable, UnitTypeId
ORDER BY UnitTypeId   
GO
BedroomsAvailable | Count |  MinPrice |   MaxPrice | AveragePrice | StandardDeviation | MedianCont | UnitTypeId
----------------: | ----: | --------: | ---------: | -----------: | ----------------: | ---------: | ---------:
                1 |     3 | 1000.0000 |  3000.0000 |    2000.0000 |              1000 |       2000 |          1
                2 |     3 | 1000.0000 |  3000.0000 |    2000.0000 |              1000 |       2000 |          1
                1 |     6 | 1000.0000 |  7000.0000 |    4000.0000 |  2366.43191323985 |       4000 |          2
                2 |     6 | 1000.0000 | 10000.0000 |    5500.0000 |  3937.00393700591 |       5500 |          2

db<>fiddle here
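
In SQL Server, PERCENTILE_CONT is an analytic function and requires an OVER clause, so it cannot sit next to MIN and MAX as a plain aggregate in a GROUP BY query. That is why the median is computed in a CTE first. As a minimal sketch of the syntax on its own, assuming the #TempListings table from above still exists (the MedianDisc alias and the PERCENTILE_DISC column are added here only for comparison):

-- Sketch only: one row per (BedroomsAvailable, UnitTypeId) with both percentile variants
SELECT DISTINCT
     BedroomsAvailable
    ,UnitTypeId
    -- interpolates between the two middle prices when the group has an even row count
    ,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Price)
         OVER (PARTITION BY BedroomsAvailable, UnitTypeId) AS MedianCont
    -- always returns a price that actually occurs in the group
    ,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Price)
         OVER (PARTITION BY BedroomsAvailable, UnitTypeId) AS MedianDisc
FROM #TempListings
ORDER BY UnitTypeId, BedroomsAvailable

The window function returns the same value on every row of a partition, so DISTINCT is what reduces the output to one row per group.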

OTHER TIPS

There are many ways to calculate the median in SQL Server. Aaron Bertrand goes through the main ones and compares their performance in "What Is The Fastest Way To Calculate The Median?"
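
As an illustration of one alternative, here is a rough sketch of a row-numbering approach. It is not taken from that article; the names Ranked, rn and cnt are made up for this example, and it assumes the #TempListings table from the question.

-- Sketch only: median per group via ROW_NUMBER, without PERCENTILE_CONT
WITH Ranked AS
(
    SELECT BedroomsAvailable, UnitTypeId, Price,
        -- position of each price inside its group, cheapest first
        ROW_NUMBER() OVER (PARTITION BY BedroomsAvailable, UnitTypeId ORDER BY Price) AS rn,
        -- number of rows in the group
        COUNT(*) OVER (PARTITION BY BedroomsAvailable, UnitTypeId) AS cnt
    FROM #TempListings
)
SELECT BedroomsAvailable, UnitTypeId, AVG(Price) AS PriceMedian
FROM Ranked
-- integer division: an odd cnt selects the single middle row,
-- an even cnt selects the two middle rows, which AVG then averages
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
GROUP BY BedroomsAvailable, UnitTypeId
ORDER BY UnitTypeId, BedroomsAvailable

For a group of three rows both expressions evaluate to 2, so only the middle row is kept; for six rows they evaluate to 3 and 4.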

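If you want to use PERCENTILE_CONT you can do it this way:

SELECT DISTINCT BedroomsAvailable, UnitTypeId,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY price)
        OVER (PARTITION BY BedroomsAvailable, UnitTypeId) AS PriceMedian
FROM #TempListings

The column alias has to come after the OVER clause. PERCENTILE_CONT returns a value on every row, so DISTINCT reduces the result to one row per group. You can then join this back to your main query on BedroomsAvailable and UnitTypeId.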

E.g:

WITH CTE_PriceMedians AS
(
    -- The window function yields one row per listing;
    -- DISTINCT keeps one row per (BedroomsAvailable, UnitTypeId) so the join does not multiply rows
    SELECT DISTINCT BedroomsAvailable, UnitTypeId,
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY price)
            OVER (PARTITION BY BedroomsAvailable, UnitTypeId) AS PriceMedian
    FROM #TempListings
)

SELECT TL.BedroomsAvailable 
    ,COUNT(TL.listingid) AS Count 
    ,MIN(TL.price) AS MinPrice 
    ,MAX(TL.price) AS MaxPrice 
    ,AVG(TL.price) AS AveragePrice
    ,STDEV(TL.price) AS StandardDeviation 
    ,TL.UnitTypeId
    ,PM.PriceMedian
    ,SUM(TL.price)/COUNT(1) AS PriceMean
FROM #TempListings AS TL
INNER JOIN CTE_PriceMedians AS PM
    ON TL.BedroomsAvailable = PM.BedroomsAvailable
    AND TL.UnitTypeId = PM.UnitTypeId
-- PriceMedian is constant within each group, so adding it to the GROUP BY does not split groups
GROUP BY TL.BedroomsAvailable, TL.UnitTypeId, PM.PriceMedian
ORDER BY TL.UnitTypeId
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange