Question

I am wondering how to leverage PostgreSQL's timestamp data type to calculate some statistics. For example, say I want to calculate the profit/loss after each transaction, so I have a three-column table tmp:

create table tmp
(date     timestamp,
 customer text,
 price    numeric)

Sample view:

date                 | customer | price 
2014-03-17 18:23:51  | buyer    | 100
2014-03-17 18:14:24  | buyer    | 101   
2014-03-17 18:09:14  | seller   | 102
2014-03-17 18:03:52  | buyer    | 103
2014-03-17 17:57:51  | seller   | 104
2014-03-17 17:52:43  | seller   | 105
2014-03-17 17:52:36  | buyer    | 106
2014-03-17 17:52:35  | seller   | 107
2014-03-17 17:52:35  | buyer    | 108
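
To reproduce the sample, the rows above can be loaded with a statement along these lines. This is only a sketch; it assumes the tmp table defined above already exists and is empty.

```sql
-- Sample rows copied from the view above, newest first.
insert into tmp (date, customer, price) values
    ('2014-03-17 18:23:51', 'buyer',  100),
    ('2014-03-17 18:14:24', 'buyer',  101),
    ('2014-03-17 18:09:14', 'seller', 102),
    ('2014-03-17 18:03:52', 'buyer',  103),
    ('2014-03-17 17:57:51', 'seller', 104),
    ('2014-03-17 17:52:43', 'seller', 105),
    ('2014-03-17 17:52:36', 'buyer',  106),
    ('2014-03-17 17:52:35', 'seller', 107),
    ('2014-03-17 17:52:35', 'buyer',  108);
```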

Now the goal is to find the spread between the price paid by the buyer and the price paid by the seller based on the closest transaction of the opposite type (buy or sell) either before or after the record in question.

For example, looking at transaction 5 from the top:

2014-03-17 17:57:51  | seller   | 104 

the record where customer = 'buyer' immediately following it is 00:06:01 after, and the record where customer = 'buyer' preceding it is 00:05:15 before, so here we would want to take the difference of the prices: 104 - 106 = -2. I've searched around and haven't been able to find any hints. Thanks.

Edit: the above example uses the price before, but I am trying to find a generalized solution that will decide which nearest price, before or after, to use in the calculation.

Solution

SQL Fiddle

select distinct on (t1.date, t1.customer)
    t1.date, t1.customer, t1.price,
    t1.price - t2.price as price_diff,
    abs(extract(epoch from (t1.date - t2.date))) as seconds_diff 
from
    tmp t1
    inner join
    tmp t2 on t1.customer != t2.customer
order by t1.date desc, t1.customer, seconds_diff
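
The query above pairs every row with every row of the opposite type and then keeps the closest match per (date, customer). As an alternative sketch, not part of the original answer, the same "nearest opposite row" lookup can be written with a LATERAL subquery, assuming PostgreSQL 9.3 or later. The alias nearest is made up for illustration. Unlike distinct on, this version keeps rows that share the same date and customer as separate rows.

```sql
-- For each row, look up the single closest row of the opposite type.
select t1.date, t1.customer, t1.price,
       t1.price - nearest.price as price_diff,
       nearest.seconds_diff
from tmp t1
cross join lateral (
    select t2.price,
           abs(extract(epoch from (t1.date - t2.date))) as seconds_diff
    from tmp t2
    where t2.customer <> t1.customer   -- opposite type only
    order by seconds_diff              -- closest in time first
    limit 1
) nearest
order by t1.date desc, t1.customer;
```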

Other answers

Your example is the preceding price. You can do this using window/analytic functions.

Here is the idea. Add up the number of "buyers" before a given row. This divides the rows into groups that are identified by having the same sum. Do the same for sellers. Now, within each group, spread the appropriate price value to get the most recent price for each possibility:

select t.*,
       -- spread the price that opens each group to every row in that group
       max(case when customer = 'buyer' then price end) over (partition by buyer_grp) as prev_buyer_price,
       max(case when customer = 'seller' then price end) over (partition by seller_grp) as prev_seller_price
from (select tmp.*,
             (case when customer = 'buyer' then price end) as buyer_price,
             (case when customer = 'seller' then price end) as seller_price,
             -- running counts: each new buyer (or seller) starts a new group
             sum(case when customer = 'buyer' then 1 else 0 end) over
                 (order by date) as buyer_grp,
             sum(case when customer = 'seller' then 1 else 0 end) over
                 (order by date) as seller_grp
      from tmp
     ) t

Your difference is then just a case statement on top of this:

select (case when customer = 'buyer' then price - prev_seller_price
             when customer = 'seller' then price - prev_buyer_price
        end) as diff

You can do the same thing in reverse order for the "next" price. However, your example only uses the previous price (which makes sense to me), although the text mentions both directions.
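
As a sketch of the reverse-order variant, the groups can be built with the running counts ordered by date descending, so that each group holds a buyer (or seller) row together with the rows that come before it in time. The column names next_buyer_price and next_seller_price are chosen here for illustration. On the sample data, the seller row at 17:57:51 should end up with a next_buyer_price of 103, the buyer 00:06:01 later.

```sql
select t.*,
       -- price of the next buyer / seller at or after this row's timestamp
       max(case when customer = 'buyer' then price end)
           over (partition by buyer_grp) as next_buyer_price,
       max(case when customer = 'seller' then price end)
           over (partition by seller_grp) as next_seller_price
from (select tmp.*,
             -- counting from the newest row backwards
             sum(case when customer = 'buyer' then 1 else 0 end) over
                 (order by date desc) as buyer_grp,
             sum(case when customer = 'seller' then 1 else 0 end) over
                 (order by date desc) as seller_grp
      from tmp
     ) t;
```

Choosing between the previous and the next price would additionally need the matching timestamps, which could be carried along in the same way by taking max over date instead of price.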

Licensed under: CC-BY-SA with attribution