Group by arbitrary monthly time period

https://dba.stackexchange.com/questions/245414

07-02-2021

Question

I want to group the following data by a user-defined period:

+------------+--------+
|    DATE    | Amount |
+------------+--------+
| 2019-03-12 |   300  |
| 2019-03-15 |  1500  |
| 2019-03-25 |  2500  |
| 2019-03-25 |  3000  |
| 2019-04-04 |  5000  |
| 2019-04-27 | 10000  |
+------------+--------+

However, the start and end of the period do not have to align with the start and end of a calendar month, e.g.:

User A: period start first of month, period end last of month

User B: period start 15th of the month, period end 14th of the following month

User C: period start 7th of the month, period end 6th of the following month

So:

User A:

+--------------+--------+
| Period Start | Amount |
+--------------+--------+
|  2019-01-01  |    0   |
|  2019-02-01  |    0   |
|  2019-03-01  |  7300  |
|  2019-04-01  | 15000  | 
|  2019-05-01  |    0   |
|  2019-06-01  |    0   |
|     ...      |   ...  |
+--------------+--------+

User B:

+--------------+--------+
| Period Start | Amount |
+--------------+--------+
|  2019-01-15  |    0   |
|  2019-02-15  |   300  |
|  2019-03-15  | 12000  |
|  2019-04-15  | 10000  | 
|  2019-05-15  |    0   |
|  2019-06-15  |    0   |
|     ...      |   ...  |
+--------------+--------+

User C:

+--------------+--------+
| Period Start | Amount |
+--------------+--------+
|  2019-01-07  |    0   |
|  2019-02-07  |    0   |
|  2019-03-07  | 12300  |
|  2019-04-07  | 10000  | 
|  2019-05-07  |    0   |
|  2019-06-07  |    0   |
|     ...      |   ...  |
+--------------+--------+

Can this be done in a vendor-agnostic way? If not, please show me how it can be done in PostgreSQL 10.x and, if possible, in HSQLDB 2.4.x.

Solution

There are a number of different ways to do this, but if you only need a monthly period starting on a given day of the month, I would store that start day in a variable and calculate a new column holding the month in which each row's period begins. Something like:

DECLARE @DayStart TINYINT = 15;  -- example value: day of the month on which each period starts

WITH cteMonth AS 
(
SELECT DateColumn, 
CASE 
-- Days before the start day belong to the period that began in the previous month
WHEN DAY(DateColumn) < @DayStart THEN CASE WHEN MONTH(DateColumn) = 1 THEN CAST(YEAR(DateColumn) - 1 AS VARCHAR(4)) + '/12' ELSE CAST(YEAR(DateColumn) AS VARCHAR(4)) + '/' + CAST(MONTH(DateColumn) - 1 AS VARCHAR(2)) END
-- Days on or after the start day belong to the period that began this month
ELSE CAST(YEAR(DateColumn) AS VARCHAR(4)) + '/' + CAST(MONTH(DateColumn) AS VARCHAR(2))
END AS DateMonth
FROM SomeTable
)
SELECT DateMonth, [aggregate columns]
FROM cteMonth
GROUP BY DateMonth

Apologies for this being T-SQL syntax (I'm not a PL/pgSQL guy), but in case you need help declaring variables: http://www.postgresqltutorial.com/plpgsql-variables/
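
For PostgreSQL, the same idea can be sketched with date arithmetic instead of string building: shift each date back by (start day - 1) days, truncate to the month, then shift forward again, which yields the period start as a real date. This is only a minimal sketch. It assumes the table and columns are named transaction, date and amount as in the question, and that the start day (15 here, a hypothetical value) lies between 1 and 28 so that it exists in every month.

```sql
-- Assumption: transaction(date DATE, amount NUMERIC).
-- Start day 15, so the offset is 15 - 1 = 14 days.
SELECT
    date_trunc('month', t.date - 14)::date + 14 AS period_start,
    SUM(t.amount) AS amount
FROM
    transaction t
GROUP BY
    period_start
ORDER BY
    period_start;
```

With the sample data, 2019-03-12 maps to the period starting 2019-02-15, while 2019-03-15 maps to 2019-03-15. Because period_start is a date, it also sorts chronologically, whereas unpadded text labels such as '2019/10' sort before '2019/2'.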

OTHER TIPS

Based on the answer by @user3760185 I got it working:

SELECT
    CASE WHEN EXTRACT(DAY FROM t.date) >= 15 THEN
        CAST(EXTRACT(YEAR FROM t.date) AS VARCHAR(4)) || '/' || CAST(EXTRACT(MONTH FROM t.date) AS VARCHAR(2))
    ELSE
        CASE WHEN EXTRACT(MONTH FROM t.date) = 1 THEN
            CAST(EXTRACT(YEAR FROM t.date) - 1 AS VARCHAR(4)) || '/12'
        ELSE
            CAST(EXTRACT(YEAR FROM t.date) AS VARCHAR(4)) || '/' || CAST(EXTRACT(MONTH FROM t.date) - 1 AS VARCHAR(2))
        END
    END
    AS DateMonth,
    SUM(t.amount) AS Amount
FROM
    transaction t
GROUP BY
    DateMonth
ORDER BY
    DateMonth ASC;

It worked without the CTE as well, because PostgreSQL allows GROUP BY to reference an output column alias such as DateMonth.
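
The queries above only return periods that contain at least one row, whereas the expected output in the question also lists empty periods with an amount of 0. One way to get those in PostgreSQL, sketched here under stated assumptions, is to generate the period starts with generate_series and left-join the totals onto them. The params CTE, the start day of 15 and the range 2019-01-01 to 2019-06-01 are illustrative choices that do not come from the original answers, and the start day is assumed to lie between 1 and 28.

```sql
WITH params AS (
    SELECT 15 AS start_day              -- hypothetical per-user setting (1..28)
),
periods AS (
    -- One row per period start: first of each month shifted to the start day
    SELECT g.month_start::date + (p.start_day - 1) AS period_start
    FROM params p
    CROSS JOIN generate_series(DATE '2019-01-01',
                               DATE '2019-06-01',
                               INTERVAL '1 month') AS g(month_start)
),
totals AS (
    -- Shift back, truncate to the month, shift forward: the period each row falls into
    SELECT date_trunc('month', t.date - (p.start_day - 1))::date
               + (p.start_day - 1) AS period_start,
           SUM(t.amount) AS amount
    FROM transaction t
    CROSS JOIN params p
    GROUP BY 1
)
SELECT pr.period_start,
       COALESCE(tt.amount, 0) AS amount  -- empty periods show 0 instead of NULL
FROM periods pr
LEFT JOIN totals tt USING (period_start)
ORDER BY pr.period_start;
```

For the sample data and a start day of 15, this should reproduce the User B table from the question. Setting start_day to 1 or 7 corresponds to users A and C.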

Licensed under: CC-BY-SA with attribution
