Question

I have a table where I count the used licenses every minute. A cron task checks a session panel and counts the number of session every minute. It fills a session_counts table.

Session_counts:

CREATE TABLE [dbo].[session_counts](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [license_count] [int] NOT NULL,
    [created_at] [datetime] NOT NULL,
    [updated_at] [datetime] NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [id] ASC
)

Example values (I have 525600 rows for a year in my real data table):

id  license_count   created_at      updated_at
1   879     2014-12-30 15:58:42.000 2014-12-30 16:10:50.000
2   55      2014-05-28 12:20:00.000 2014-05-28 12:26:45.000
3   40      2014-03-15 15:51:26.000 2014-03-15 16:02:29.000
4   979     2014-04-25 09:09:49.000 2014-04-25 09:24:03.000
5   614     2014-12-27 11:17:53.000 2014-12-27 11:19:36.000
6   721     2014-09-02 15:20:40.000 2014-09-02 15:32:47.000
7   625     2014-02-27 11:17:42.000 2014-02-27 11:23:26.000
8   1576        2014-12-30 17:18:32.000 2014-12-30 17:36:55.000
9   893     2014-07-26 17:32:47.000 2014-07-26 17:34:22.000
10  897     2014-07-19 11:26:32.000 2014-07-19 11:42:26.000

Now I want to create statistics between a time range like this:

SELECT 
    MIN(license_count) min_lic
  , MAX(license_count) max_lic
  , AVG(license_count) avg_lic
  , DATEPART(month, created_at) monat
FROM session_counts
WHERE created_at BETWEEN '01.01.2014' AND '31.12.2014'
GROUP BY DATEPART(month, created_at)
ORDER BY monat DESC

This gives me MIN, AVG, MAX usages.

Instead of AVG I want to get the median value of the license usage.

My intention is to cut off the min/max peaks in a year to get a more real usage statistic. If necessary I can setup a min/max gap value, if that helps.

I checked this link Function to Calculate Median in Sql Server before, but I wasn't able to transform this to my need.

I need to make this run on SQL Server 2005, 2008 and 2012, so I can't use PERCENTILE_CONT() from 2012.

If someone want to create a table and fill with lots of data I can provide this small TSQL to fill data:

DECLARE @numRows int,@i int, @date datetime
SET @numRows = 100000
SET @i=1

WHILE @i<@numRows
BEGIN
  SET @date = DATEADD(second, RAND()*36000, DATEADD(day, RAND()*365, '2014-01-01 08:00:00'))
    INSERT session_counts (license_count, created_at, updated_at)
  SELECT CAST(RAND()*1000 AS int) license_count
   , @date created_at
   , DATEADD(second, RAND()*1440, @date) updated_at
    SET @i=@i+1
END

Hope anybody has a proper solution for my problem.

Was it helpful?

Solution

Just tried to implement method described here: http://www.sqlperformance.com/2012/08/t-sql-queries/median

SELECT MIN(license_count) min_lic
    , MAX(license_count) max_lic
    , AVG(license_count) avg_lic
    , SUM(CASE WHEN ra BETWEEN rd - 1 AND rd + 1 THEN license_count ELSE 0 END) * 1.0
    / SUM(CASE WHEN ra BETWEEN rd - 1 AND rd + 1 THEN 1 ELSE 0 END) as median_lic
    , DATEPART(month, created_at) monat
FROM (
    SELECT created_at
        , license_count
        , ROW_NUMBER() OVER (PARTITION BY DATEPART(month, created_at) ORDER BY license_count, created_at) as ra
        , ROW_NUMBER() OVER (PARTITION BY DATEPART(month, created_at) ORDER BY license_count DESC, created_at DESC) as rd
    FROM session_counts
    WHERE created_at BETWEEN '01.01.2014' AND '31.12.2014'
) T
GROUP BY DATEPART(month, created_at)
ORDER BY monat DESC
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top