Question

I need to find all members in the database with data on 7 or more consecutive days. The table is set up as member_id and data_date. There are duplicate member_ids for each day that there is data.

I found some answers using MySQL that involved datediff or dateadd, but I am not sure how to do this in PostgreSQL 8.2.15. Below is an example of what the table looks like, but with many more rows.

+------------+------------+
| member_id  |  data_date |
+------------+------------+
|    0000001 | 2018-04-10 |
|    0000005 | 2018-04-16 |
|    0000001 | 2018-04-11 |
|    0000002 | 2018-04-12 |
|    0000003 | 2018-04-13 |
|    0000004 | 2018-04-12 |
|    0000005 | 2018-04-15 |
|    0000003 | 2018-04-19 |
|    0000002 | 2018-04-17 |
|    0000001 | 2018-04-18 |
|    0000005 | 2018-04-10 |
|    0000002 | 2018-04-18 |
|    0000001 | 2018-04-08 |
|    0000002 | 2018-04-03 |
|    0000003 | 2018-04-02 |
|    0000004 | 2018-04-14 |
|    0000005 | 2018-04-15 |
|    0000003 | 2018-04-16 |
|    0000002 | 2018-04-19 |
|    0000001 | 2018-04-14 |
+------------+------------+

(member_id, data_date) is defined UNIQUE.


Solution

This simple query should work in your outdated version:

SELECT DISTINCT member_id
FROM   tbl t1
JOIN   tbl t2 USING (member_id)
JOIN   tbl t3 USING (member_id)
JOIN   tbl t4 USING (member_id)
JOIN   tbl t5 USING (member_id)
JOIN   tbl t6 USING (member_id)
JOIN   tbl t7 USING (member_id)
WHERE  t2.data_date = t1.data_date + 1
AND    t3.data_date = t1.data_date + 2
AND    t4.data_date = t1.data_date + 3
AND    t5.data_date = t1.data_date + 4
AND    t6.data_date = t1.data_date + 5
AND    t7.data_date = t1.data_date + 6;

In Postgres, you can simply add an integer to a date to get a later day: data_date + 1 is the next day.
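As a quick illustration of the date arithmetic the query relies on, here is a minimal sketch using literal dates (the column aliases are only for illustration):

```sql
-- date + integer yields the date that many days later;
-- date - date yields the number of days between them as an integer.
SELECT date '2018-04-10' + 1                 AS next_day      -- 2018-04-11
     , date '2018-04-10' + 6                 AS six_days_on   -- 2018-04-16
     , date '2018-04-16' - date '2018-04-10' AS days_between; -- 6
```

So a streak of 7 consecutive days starting on some date d covers d through d + 6.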

The query is probably fast, too, as long as you don't have streaks much longer than 7 days, which produce many duplicate rows before DISTINCT removes them.

Or:

SELECT DISTINCT member_id
FROM   tbl t1
JOIN   tbl t2 USING (member_id)
WHERE  t2.data_date BETWEEN t1.data_date + 1
                        AND t1.data_date + 6
GROUP  BY t1.member_id, t1.data_date
HAVING count(*) = 6;

Update to a current version of Postgres at the earliest opportunity.

OTHER TIPS

Your version is very antiquated and insecure. For reference, 8.2 was released in 2006 and was last supported in 2011. It lacks

  • tstz ranges
  • window functions
  • lateral joins

Moreover, we can't easily configure your setup to test our queries. Upgrade your version of PostgreSQL; if you find that difficult to do, consider seeking out a consultant.
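To show what the upgrade would buy, here is a sketch of the common "gaps and islands" technique with a window function. It needs PostgreSQL 8.4 or later, so it will not run on 8.2. It assumes the table is named tbl and that (member_id, data_date) is unique, as stated in the question; the names grp and sub are made up for this example.

```sql
-- Within one member, consecutive dates minus their row number give the same
-- value, so every unbroken streak collapses into one group (grp).
SELECT DISTINCT member_id
FROM  (
   SELECT member_id
        , data_date - (row_number() OVER (PARTITION BY member_id
                                          ORDER BY data_date))::int AS grp
   FROM   tbl
   ) sub
GROUP  BY member_id, grp
HAVING count(*) >= 7;   -- streaks of 7 or more consecutive days
```

Unlike the self-join variants, this reads the table once and handles streaks of any length without producing extra duplicate rows.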

I think what you want is something like

SELECT member_id, t2.data_date, count(*)
FROM t AS t1
JOIN t AS t2 USING (member_id)
WHERE t1.data_date BETWEEN t2.data_date AND t2.data_date + 6
GROUP BY member_id, t2.data_date
HAVING count(*) >= 7;

But I can't test it and you didn't provide CREATE TABLE or INSERT statements.
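For anyone who wants to test the queries, here is a minimal setup sketch built from the sample rows in the question. The table name tbl and the text type for member_id are assumptions (the leading zeros suggest text, but the question does not say). The sample lists (0000005, 2018-04-15) twice, which would violate the stated UNIQUE constraint, so that row appears once here.

```sql
CREATE TABLE tbl (
   member_id text NOT NULL   -- assumed type; not given in the question
 , data_date date NOT NULL
 , UNIQUE (member_id, data_date)
);

INSERT INTO tbl (member_id, data_date) VALUES
   ('0000001', '2018-04-10'), ('0000005', '2018-04-16')
 , ('0000001', '2018-04-11'), ('0000002', '2018-04-12')
 , ('0000003', '2018-04-13'), ('0000004', '2018-04-12')
 , ('0000005', '2018-04-15'), ('0000003', '2018-04-19')
 , ('0000002', '2018-04-17'), ('0000001', '2018-04-18')
 , ('0000005', '2018-04-10'), ('0000002', '2018-04-18')
 , ('0000001', '2018-04-08'), ('0000002', '2018-04-03')
 , ('0000003', '2018-04-02'), ('0000004', '2018-04-14')
 , ('0000003', '2018-04-16'), ('0000002', '2018-04-19')
 , ('0000001', '2018-04-14');
```

No member in this sample has more than five rows, so a correct query returns no rows against it. Add a run of seven consecutive dates for one member to see a match.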

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange