Question

I am trying to group dates that fall within a 1-year interval for a given identifier, labeling the earliest and the latest date of each group. If no other date falls within a 1-year interval of a given date, that date is recorded as both the first and the last date. For example, the original data is:

id | date 
____________
a  | 1/1/2000
a  | 1/2/2001
a  | 1/6/2000
b  | 1/3/2001
b  | 1/3/2000
b  | 1/3/1999
c  | 1/1/2000
c  | 1/1/2002
c  | 1/1/2003

And the output I want is:

id  | first_date | last_date
___________________________
a   | 1/1/2000   | 1/2/2001
b   | 1/3/1999   | 1/3/2001
c   | 1/1/2000   | 1/1/2000
c   | 1/1/2002   | 1/1/2003

I have been trying to figure this out all day without success. I can do it for ids with only 2 rows, but not for ids with more. Any help would be great.

Solution

SELECT id
     , min(min_date) AS min_date
     , max(max_date) AS max_date
     , sum(row_ct)   AS row_ct
FROM  (
   SELECT id, year, min_date, max_date, row_ct
        , year - row_number() OVER (PARTITION BY id ORDER BY year) AS grp
   FROM  (
      SELECT id
           , extract(year FROM the_date)::int AS year
           , min(the_date) AS min_date
           , max(the_date) AS max_date
           , count(*)      AS row_ct
      FROM   tbl
      GROUP  BY id, year
      ) sub1
   ) sub2
GROUP  BY id, grp
ORDER  BY id, grp;

1) Group all rows per (id, year), in subquery sub1. Record min and max of the date. I added a count of rows (row_ct) for demonstration.

2) Subtract the row_number() from the year in the second subquery sub2. Thus, all rows with consecutive years end up in the same group (grp). A gap in the years starts a new group.

3) In the final SELECT, group a second time, this time by (id, grp), and record min, max and row count again. Voilà. This produces exactly the result you are looking for.
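To see why step 2 works, it can help to look at the intermediate values. The following is a minimal sketch, assuming PostgreSQL and a temporary table named tbl with a date column the_date, as in the query above. The sample rows are the ones from the question, read as month/day/year, which is an assumption; the day/month reading yields the same years and therefore the same groups.

```sql
-- Test setup (assumed names: tbl, the_date; dates read as month/day/year)
CREATE TEMP TABLE tbl (id text, the_date date);

INSERT INTO tbl (id, the_date) VALUES
  ('a', '2000-01-01'), ('a', '2001-01-02'), ('a', '2000-01-06')
, ('b', '2001-01-03'), ('b', '2000-01-03'), ('b', '1999-01-03')
, ('c', '2000-01-01'), ('c', '2002-01-01'), ('c', '2003-01-01');

-- Intermediate view of step 2: one row per (id, year) plus the computed group
SELECT id, year
     , row_number() OVER (PARTITION BY id ORDER BY year)        AS rn
     , year - row_number() OVER (PARTITION BY id ORDER BY year) AS grp
FROM  (
   SELECT id, extract(year FROM the_date)::int AS year
   FROM   tbl
   GROUP  BY id, year
   ) sub1
ORDER  BY id, year;

--  id | year | rn | grp
-- ----+------+----+------
--  a  | 2000 |  1 | 1999
--  a  | 2001 |  2 | 1999
--  b  | 1999 |  1 | 1998
--  b  | 2000 |  2 | 1998
--  b  | 2001 |  3 | 1998
--  c  | 2000 |  1 | 1999
--  c  | 2002 |  2 | 2000
--  c  | 2003 |  3 | 2000
```

Consecutive years increase in lockstep with rn, so the difference stays constant within a run. For id c the missing year 2001 shifts the difference from 1999 to 2000, which is what starts the second group.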

-> SQLfiddle demo.
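Note that the query above groups by calendar year, so two dates such as 2000-01-01 and 2001-12-31 land in the same group although they are almost two years apart. If "within a 1 year interval" is meant literally, as the distance between one date and the next, a variant along these lines might fit better. This is only a sketch under that assumed reading, for PostgreSQL 9.4 or later (it uses the aggregate FILTER clause); the aliases s1, s2 and new_grp are made up for illustration.

```sql
SELECT id
     , min(the_date) AS first_date
     , max(the_date) AS last_date
FROM  (
   SELECT id, the_date
          -- running count of "gap" rows = group number within each id
        , count(*) FILTER (WHERE new_grp)
             OVER (PARTITION BY id ORDER BY the_date) AS grp
   FROM  (
      SELECT id, the_date
             -- true when the previous date of the same id is more than 1 year back;
             -- NULL for the first row per id, which the FILTER does not count
           , the_date - interval '1 year'
             > lag(the_date) OVER (PARTITION BY id ORDER BY the_date) AS new_grp
      FROM   tbl
      ) s1
   ) s2
GROUP  BY id, grp
ORDER  BY id, first_date;
```

With the sample data from the question this returns the same four rows as the calendar-year query. The two approaches differ only when dates in adjacent calendar years are more than a year apart, or when dates less than a year apart are separated by a skipped calendar year.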

Related answers:
Return array of years as year ranges
Group by repeating attribute

OTHER TIPS

-- Returns a single row per id; a gap in the years does not start a new row.
SELECT id, min([date]) AS first_date, max([date]) AS last_date
FROM <yourTbl>
GROUP BY id

Use this (SQLFiddle Demo). Note that it returns one row per id, so it does not start a new row after a gap in the years:

SELECT id,
    min(date) AS first_date,
    max(date) AS last_date
FROM mytable
GROUP BY 1
ORDER BY 1
Licensed under: CC-BY-SA with attribution