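First off, I am running DB2 for i5/OS V5R4. I have ROW_NUMBER(), RANK(), and common table expressions. I do not have TOP n PERCENT or LIMIT OFFSET.

The actual data set I'm working with is hard to explain, so let's just say I have a weather history table with the columns (city, temperature, timestamp). I want to compare the median to the average for each group (city).

This is the cleanest way I have found to get a median for a whole-table aggregation. I adapted it from an IBM Redbook: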

WITH base_t AS
( SELECT temperature, row_number() over (order by temperature) AS rownum FROM t ),
count_t AS
( SELECT COUNT(temperature) + 1 AS base_count FROM base_t ),
median_t AS
( SELECT temperature FROM base_t, count_t
  WHERE rownum in (FLOOR(base_count/2e0), CEILING(base_count/2e0)) )
SELECT DECIMAL(AVG(temperature),10,2) AS median FROM median_t
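
To see why the query picks the right rows: with n rows, FLOOR((n+1)/2) and CEILING((n+1)/2) both land on the single middle row when n is odd, and on the two middle rows when n is even. The following is only a rough Python sketch of that arithmetic, not DB2 code; the function names are made up for illustration, and it assumes a non-empty list of values.

```python
import math


def median_row_numbers(row_count):
    """Return the 1-based row numbers whose average is the median.

    Mirrors FLOOR(base_count/2e0) and CEILING(base_count/2e0) from the query,
    where base_count is COUNT(...) + 1.
    """
    count_plus_one = row_count + 1
    lower = math.floor(count_plus_one / 2.0)
    upper = math.ceil(count_plus_one / 2.0)
    return lower, upper


def median_by_row_number(values):
    """Average the value(s) sitting at the middle row number(s)."""
    ordered = sorted(values)  # plays the role of ORDER BY temperature
    lower, upper = median_row_numbers(len(ordered))
    picked = [
        value
        for rownum, value in enumerate(ordered, start=1)  # row_number()
        if rownum in (lower, upper)
    ]
    return sum(picked) / len(picked)


if __name__ == "__main__":
    print(median_row_numbers(5))  # odd count: one middle row
    print(median_row_numbers(4))  # even count: two middle rows
    print(median_by_row_number([64, 60, 70]))
```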

That works well for getting a single row back, but it seems to fall apart when grouping. Conceptually, this is what I want:


SELECT city, AVG(temperature), MEDIAN(temperature) FROM ...

city           | mean_temp       | median_temp       
===================================================
'Minneapolis'  | 60              | 64
'Milwaukee'    | 65              | 66
'Muskegon'     | 70              | 61

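There may be an answer that makes me look silly, but I'm having a mental block and this isn't my #1 priority right now. It seems like it should be possible, but I can't use anything extremely complex, because this is a large table and I want to be able to customize which columns are being aggregated.

Any help?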

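Solution

In SQL Server, aggregate functions such as COUNT(*) can be partitioned and calculated without a GROUP BY. I looked quickly through the referenced Redbook, and it looks like DB2 has the same feature. If it does not, this won't work: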

create table TemperatureHistory 
    (City varchar(20)
    , Temperature decimal(5, 2)
    , DateTaken datetime)

insert into TemperatureHistory values ('Minneapolis', 61, '20090101')
insert into TemperatureHistory values ('Minneapolis', 59, '20090102')

insert into TemperatureHistory values ('Milwaukee', 65, '20090101')
insert into TemperatureHistory values ('Milwaukee', 65, '20090102')
insert into TemperatureHistory values ('Milwaukee', 100, '20090103')

insert into TemperatureHistory values ('Muskegon', 80, '20090101')
insert into TemperatureHistory values ('Muskegon', 70, '20090102')
insert into TemperatureHistory values ('Muskegon', 70, '20090103')
insert into TemperatureHistory values ('Muskegon', 20, '20090104')

; with base_t as
    (select city
        , Temperature
        , row_number() over (partition by city order by temperature) as RowNum
        , (count(*) over (partition by city)) + 1 as CountPlusOne 
    from TemperatureHistory)
select City
    , avg(Temperature) as MeanTemp
    , avg(case 
        when RowNum in (FLOOR(CountPlusOne/2.0), CEILING(CountPlusOne/2.0)) 
            then Temperature
            else null end) as MedianTemp
from base_t 
group by City
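
As a sanity check on the query above, here is a small Python sketch that applies the same per-city row numbering to the sample rows and compares the result with the standard library's statistics.median. It is only an illustration of the logic, not something that runs on DB2; the function name mean_and_median_by_city is invented for this example.

```python
import math
import statistics
from collections import defaultdict

# Same sample data as the INSERT statements above (dates omitted).
ROWS = [
    ("Minneapolis", 61), ("Minneapolis", 59),
    ("Milwaukee", 65), ("Milwaukee", 65), ("Milwaukee", 100),
    ("Muskegon", 80), ("Muskegon", 70), ("Muskegon", 70), ("Muskegon", 20),
]


def mean_and_median_by_city(rows):
    """Return {city: (mean, median)} using the row-number technique."""
    by_city = defaultdict(list)  # PARTITION BY city
    for city, temperature in rows:
        by_city[city].append(temperature)

    result = {}
    for city, temps in by_city.items():
        temps.sort()  # ORDER BY temperature
        count_plus_one = len(temps) + 1  # count(*) over (...) + 1
        wanted = {
            math.floor(count_plus_one / 2.0),
            math.ceil(count_plus_one / 2.0),
        }
        # Like the CASE expression: keep only the middle row(s); the other
        # rows are skipped, just as AVG ignores NULLs.
        picked = [t for rownum, t in enumerate(temps, start=1) if rownum in wanted]
        result[city] = (sum(temps) / len(temps), sum(picked) / len(picked))
    return result


if __name__ == "__main__":
    for city, (mean, median) in mean_and_median_by_city(ROWS).items():
        expected = statistics.median(t for c, t in ROWS if c == city)
        print(city, round(mean, 2), median, median == expected)
```

With this sample data the outlier in Milwaukee (100) pulls the mean well above the median, which is the kind of difference the question is trying to surface.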

Licensed under CC-BY-SA with attribution.