Question

I need a query for SQl server 2005 (SQL server management studio express). I have data stored as 1 minute time frame (1 minute each row), for each table columns are ID, Symbol, DateTime, Open, High, Low, Close, Volume. I need to convert (compress) to every possibile multiple time frame, so let's say 10 minutes, 13, 15, and so on. Provide full details if somebody could help. Thanks Alberto

Was it helpful?

Solution

;WITH cte AS
(SELECT *,
        (32 * CAST([DATETIME] AS INT)) + DATEPART(HOUR,[DATETIME]) + (DATEPART(MINUTE,[DATETIME])/15)/4.0 AS Seg
     FROM     prices
     )
,cte1 AS
(
SELECT *,
        ROW_NUMBER() OVER (PARTITION BY Symbol,Seg ORDER BY [DATETIME])      AS RN_ASC ,
        ROW_NUMBER() OVER (PARTITION BY Symbol,Seg ORDER BY [DATETIME] DESC) AS RN_DESC
FROM cte
)     
SELECT 
      Symbol,
      Seg,
      MAX(CASE WHEN RN_ASC=1 THEN [DATETIME] END) AS OpenDateTime,
      MAX(CASE WHEN RN_ASC=1 THEN [OPEN] END) AS [OPEN],
      MAX(High) High,
      MIN(Low)  Low,
      SUM(Volume) Volume,
      MAX(CASE WHEN RN_DESC=1 THEN [CLOSE] END) AS [CLOSE],
      MAX(CASE WHEN RN_DESC=1 THEN [DATETIME] END) AS CloseDateTime
FROM cte1
GROUP BY Symbol,Seg
ORDER BY OpenDateTime

Or another approach that may be worth testing to see if it is any faster.

DECLARE @D1 DATETIME
DECLARE @D2 DATETIME
DECLARE @Interval FLOAT

SET @D1  = '2010-10-18 09:00:00.000'
SET @D2  = '2010-10-19 18:00:00.000'
SET @Interval = 15

;WITH 
L0 AS (SELECT 1 AS c UNION ALL SELECT 1),
L1 AS (SELECT 1 AS c FROM L0 A CROSS JOIN L0 B),
L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B),
L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B),
L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B),
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS i FROM L4),
Ranges AS(
SELECT 
      DATEADD(MINUTE,@Interval*(i-1),@D1) AS StartRange,
      DATEADD(MINUTE,@Interval*i,@D1) AS NextRange
FROM Nums where i <= 1+CEILING(DATEDIFF(MINUTE,@D1,@D2)/@Interval))
,cte AS (
SELECT 
     * 
     ,ROW_NUMBER() OVER (PARTITION BY Symbol,r.StartRange ORDER BY [DateTime])      AS RN_ASC 
     ,ROW_NUMBER() OVER (PARTITION BY Symbol,r.StartRange ORDER BY [DateTime] DESC) AS RN_DESC
FROM Ranges r
JOIN prices p ON p.[DateTime] >= r.StartRange and p.[DateTime] < r.NextRange )
SELECT 
      Symbol,
      MAX(CASE WHEN RN_ASC=1 THEN [DateTime] END) AS OpenDateTime,
      MAX(CASE WHEN RN_ASC=1 THEN [Open] END) AS [Open],
      MAX(High) High,
      MIN(Low)  Low,
      SUM(Volume) Volume,
      MAX(CASE WHEN RN_DESC=1 THEN [Close] END) AS [Close],
      MAX(CASE WHEN RN_DESC=1 THEN [DateTime] END) AS CloseDateTime
FROM cte
GROUP BY Symbol,StartRange
ORDER BY OpenDateTime

OTHER TIPS

Alberto, it looks like you need a "Group By" clause in SQL statements (as Leppie stated). So, you should better look for it.

First you should filter the rows that is subject for aggregation by using begin and end date/time and then group them by the mentioned clause.

Here is the first link when i search "sql group by" keywords via Google.

Not simple "Group By" - Open and Close values need taken for first and correspondingly last row in group. Or at least so is it for Forex data :)

Would be prettier with a stored proc to extract MIN(datetime) first, but here's a sketch:

WITH quarters(q) AS (
    SELECT DISTINCT
        15*CAST(DATEDIFF("n",'2000/01/01',dataora) / 15 as Int) AS primo
    FROM 
        Prezzi
)
SELECT
    simbolo, DATEADD("n",q,'2000/01/01') AS tick, 
        MIN(minimo) AS minimo, MAX(massimo) AS massimo,
        (SELECT 
            TOP 1 apertura FROM Prezzi P 
         WHERE 
            P.simbolo = simbolo AND 
            P.dataora >= DATEADD("n",q,'2000/01/01')
         ORDER BY
            P.dataora ASC
         ) as primaapertura,
        (SELECT 
            TOP 1 chiusura FROM Prezzi P 
         WHERE 
            P.simbolo = simbolo AND 
            P.dataora < DATEADD("s",14*60+59,DATEADD("n",q,'2000/01/01'))
         ORDER BY
            P.dataora DESC
         ) as ultimachiusara,
        SUM(volume) / COUNT(*) AS volumemedio
FROM
    quarters INNER JOIN Prezzi
    ON dataora BETWEEN DATEADD("n",q,'2000/01/01')
        AND DATEADD("s",14*60+59,DATEADD("n",q,'2000/01/01'))
GROUP BY
    simbolo, DATEADD("n",q,'2000/01/01')
ORDER BY 
    1, 2

The WITH clause gets a list of 15 minute intervals, rounded down, in your dataset (let's assume nothing before 2000). Then use those intervals to group by 14:59 interval. For the volume, you'll have to decide if you want average or the total.

The syntax might be a tad off, but you should get the idea.

EDIT: Adjusted MIN(open), MIN(close) to pick up FIRST and LAST. In reality this won't change much, as the concept of Open and Close depend on knowing the time difference between the exchange where the quote originated and the clock of the computer collecting the data.

In addition, unless the OP has the privilege of a real-time feed from all the exchanges, all the quotes are delayed by 20 minutes anyway.

EDIT(2): Quite right, FIRST and LAST are carry-overs from my IBM days >;-)

Solution now selects first and last quotes during the interval using TOP with ASC/DESC.

    Declare @tbl1MinENI Table 
    (ID int identity,
     Simbolo char(3),
     DataOra datetime,
     Apertura numeric(15,4),
     Massimo  numeric(15,4),
     Minimo numeric(15,4),
     Chiusura numeric(15,4),
     Volume int)

    Insert Into  @tbl1MinENI (  Simbolo, DataOra, Apertura, Massimo, Minimo, Chiusura, Volume)
    Values
    ('ENI', '2010/10/18 09:00:00', 16.1100, 16.1800, 16.1100, 16.1400, 244015),
    ('ENI', '2010/10/18 09:01:00', 16.1400, 16.1400, 16.1300, 16.1400, 15692 ),
    ('ENI', '2010/10/18 09:02:00', 16.1400, 16.1500, 16.1400, 16.1500, 147035),
    ('ENI', '2010/10/18 09:03:00', 16.1500, 16.1600, 16.1500, 16.1600, 5181  ),
    ('ENI', '2010/10/18 09:04:00', 16.1600, 16.2000, 16.1600, 16.1900, 5134  ),
    ('ENI', '2010/10/18 09:05:00', 16.1900, 16.1900, 16.1800, 16.1800, 15040 ),
    ('ENI', '2010/10/18 09:06:00', 16.1900, 16.1900, 16.1600, 16.1600, 68867 ),
    ('ENI', '2010/10/18 09:07:00', 16.1600, 16.1600, 16.1600, 16.1600, 7606  ),
    ('ENI', '2010/10/18 09:08:00', 16.1500, 16.1500, 16.1500, 16.1500, 725   ),
    ('ENI', '2010/10/18 09:09:00', 16.1600, 16.1600, 16.1600, 16.1600, 81    ),
    ('ENI', '2010/10/18 09:10:00', 16.1700, 16.1800, 16.1700, 16.1700, 68594 ),
    ('ENI', '2010/10/18 09:11:00', 16.1800, 16.1800, 16.1800, 16.1800, 6619  )

    Declare @nRowsPerGroup int = 3

;With Prepare as
(
Select datediff(minute, '2010/10/18 09:00:00', DataOra)/@nRowsPerGroup as Grp,
       Row_Number() over (partition by datediff(minute, '2010/10/18 09:00:00', DataOra)/@nRowsPerGroup order by dataora) as rn,
       *
  From tbl1MinENI     
), b as
(
Select a.Grp, 
         Min(a.DataOra)          as GroupDataOra, 
         Min(ID) AperturaID,
         max(a.Massimo)          as Massimo, 
         Min(a.Minimo)           as Minimo, 
         max(id) ChiusuraID,
         sum(a.Volume)           as Volume
    From Prepare a
   Group by Grp
)
Select b.grp,
       b.GroupDataOra,
       ta.Apertura,
       b.Massimo,
       b.Minimo,
       tc.Chiusura,
       b.Volume
 From b
 Inner Join tbl1MinENI ta on ta.ID=b.AperturaID
 Inner Join tbl1MinENI tc on tc.ID=b.ChiusuraID
 ;   
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top