I have different dates with the number of products viewed on a web page over a 30-day time frame. I am trying to create an exponential decay model in SQL. I am using exponential decay because I want to weight the latest events more heavily than older ones. I am not sure how to write this in SQL without getting an error. I have never built this type of model before, so I also want to make sure I am doing it correctly.

Data looks like this:

product     views   date
--------------------------------
  a            1     2014-05-15
  a            2     2014-05-01
  b            2     2014-05-10
  c            4     2014-05-02
  c            1     2014-05-12
  d            3     2014-05-11

Code:

create table decay model as
select product,views,date
case when......
from table abc
group by product;

I am not sure what to write to build the model.

I want to penalize products whose views are older relative to products that were viewed more recently.

Thank you for your help.

Solution

You can do it like this:

  1. Choose the partition within which you want to apply exponential decay, then order the rows of each such group by date, descending.

  2. Use the function ROW_NUMBER() over that ordering to number the rows within each group, so that the most recent row gets number 1.

  3. Calculate power(<your_variable>, rownum - 1), where <your_variable> is a decay factor between 0 and 1, and multiply your value by it.

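Code might look like this (written with Oracle SQL or DB2 in mind, untested; note that ROWNUM and DATE are reserved words in Oracle, so those identifiers may need to be renamed or quoted there):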

SELECT <your_partitioning>, date, <whatever>*power(<your_variable>,rownum-1)
FROM (SELECT a.*
           , ROW_NUMBER() OVER (PARTITION BY <your_partitioning> ORDER BY a.date DESC) AS rownum
      FROM YOUR_TABLE a)
ORDER BY <your_partitioning>, date DESC

EDIT: I reread your question and think I now understand what you are asking for, so here is a solution that might work (the decay factor is 0.9 here):

SELECT product, sum(adjusted_views) AS total_adjusted_views -- (i)
FROM (SELECT product, views*power(0.9, rownum-1) AS adjusted_views, date, rownum -- (ii)
      FROM (SELECT product, views, date -- (iii)
                 , ROW_NUMBER() OVER (PARTITION BY product ORDER BY a.date DESC) AS rownum
            FROM YOUR_TABLE a) ranked
     ) adjusted
GROUP BY product

The innermost select statement (iii) produces an intermediate result that might look like this:

product      views      date          rownum
--------------------------------------------------
  a            1     2014-05-15         1
  a            2     2014-05-14         2
  a            2     2014-05-13         3
  b            2     2014-05-10         1
  b            3     2014-05-09         2
  b            2     2014-05-08         3
  b            1     2014-05-07         4

The next query (ii) then uses the row number to construct an exponentially decaying factor 0.9^(rownum-1) and multiplies views by it. The result is:

product     adjusted_views     date          rownum
--------------------------------------------------
  a            1 * 0.9^0     2014-05-15         1
  a            2 * 0.9^1     2014-05-14         2
  a            2 * 0.9^2     2014-05-13         3
  b            2 * 0.9^0     2014-05-10         1
  b            3 * 0.9^1     2014-05-09         2
  b            2 * 0.9^2     2014-05-08         3
  b            1 * 0.9^3     2014-05-07         4

In the last step (the outer query (i)) the adjusted views are summed per product, as this seems to be the quantity you are interested in.
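To check the arithmetic, here is a small Python sketch that applies the same weighting by hand to the illustrative rows from the tables above. It is only a cross-check, not part of the SQL solution; the names DECAY, rows and totals are made up for this example.

```python
from collections import defaultdict

DECAY = 0.9  # decay factor used in the answer's example

# (product, views, date, rownum) as in the intermediate table from query (iii)
rows = [
    ("a", 1, "2014-05-15", 1),
    ("a", 2, "2014-05-14", 2),
    ("a", 2, "2014-05-13", 3),
    ("b", 2, "2014-05-10", 1),
    ("b", 3, "2014-05-09", 2),
    ("b", 2, "2014-05-08", 3),
    ("b", 1, "2014-05-07", 4),
]

# Steps (ii) and (i): weight each row by DECAY^(rownum - 1), then sum per product
totals = defaultdict(float)
for product, views, _date, rownum in rows:
    totals[product] += views * DECAY ** (rownum - 1)

for product in sorted(totals):
    print(product, round(totals[product], 3))
# prints:
# a 4.42
# b 7.049
```

So product a ends up with 1 + 1.8 + 1.62 = 4.42 adjusted views, and product b with 2 + 2.7 + 1.62 + 0.729 = 7.049.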

Note, however, that for this to be consistent the dates should be regularly spaced, e.g., always one day apart (not one day here and a month there, because those rows would be weighted in a similar fashion although they shouldn't be).
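If the dates are not regularly spaced, as in the sample data from the question, one possible variation is to decay by the number of days elapsed instead of by row number. The sketch below is a different technique from the ROW_NUMBER() query above and rests on several assumptions: it runs against an in-memory SQLite database from Python, measures age relative to the latest date in the table, uses the hypothetical column name view_date, and registers power() from Python because not every SQLite build ships it. Date arithmetic differs between databases, so the julianday() calls would need to be replaced elsewhere.

```python
import math
import sqlite3

# Sample rows from the question: (product, views, date)
rows = [
    ("a", 1, "2014-05-15"),
    ("a", 2, "2014-05-01"),
    ("b", 2, "2014-05-10"),
    ("c", 4, "2014-05-02"),
    ("c", 1, "2014-05-12"),
    ("d", 3, "2014-05-11"),
]

conn = sqlite3.connect(":memory:")
# Assumption: make power() available regardless of how SQLite was compiled
conn.create_function("power", 2, math.pow)
conn.execute("CREATE TABLE abc (product TEXT, views INTEGER, view_date TEXT)")
conn.executemany("INSERT INTO abc VALUES (?, ?, ?)", rows)

# Weight each row by 0.9^(age in days), with age measured from the latest date
query = """
SELECT product,
       SUM(views * power(0.9, julianday(ref.latest) - julianday(view_date)))
           AS decayed_views
FROM abc
CROSS JOIN (SELECT MAX(view_date) AS latest FROM abc) AS ref
GROUP BY product
ORDER BY product
"""
decayed = dict(conn.execute(query).fetchall())

for product, score in decayed.items():
    print(product, round(score, 4))
```

With this weighting, a view from 14 days ago counts as 0.9^14 of a view from today, no matter how many other rows exist for that product.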

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow