Question

I'm having a hard time explaining this through writing, so please be patient.

I'm working on a project in which I have to choose a month and a year and list all the employees who were active during that month. In my database I store the dates when they started and when they finished in dd/mm/yyyy format.

So if I have an employee who worked for 4 months, e.g. from 01/01/2013 to 01/05/2013, I'll have him in four months. I need him to appear in 4 tables (one for every active month) along with the other employees that are active during those months. In this case those will be January, February, March and April of 2013.

The problem is I have no idea how to write a query or PHP processing to achieve this.

All I can think of is something like this (I'd run this query for every month, passing the year and month as arguments):

pg_query= "SELECT employee_name FROM employees
           WHERE month_and_year between start_date AND finish_date"

But that can't be done, mainly because month_and_year must be a column, not a variable.
Ideas, anyone?

UPDATE

Yes, I'm very sorry that I forgot to say I was using DATE as data type.

The easiest solution I found was to use EXTRACT:

select * from employees where extract (year FROM start_date)>='2013'
AND extract (month FROM start_date)='06' AND extract (month FROM finish_date)<='07'

This gives me all records from June of 2013. You can of course substitute the literal values with any variable of your preference.


Solution

There is no need to create a range to make an overlap:

select to_char(d, 'YYYY-MM') as "Month", e.name
from
    (
        select generate_series(
            '2013-01-01'::date, '2013-05-01', '1 month'
        )::date
    ) s(d)
    inner join
    employee e on
        date_trunc('month', e.start_date)::date <= s.d
        and coalesce(e.finish_date, 'infinity') > s.d
order by 1, 2

SQL Fiddle

If you want the months with no active employees to show, change the inner join to a left join.


Erwin, about your comment:

the second expression would have to be coalesce(e.finish_date, 'infinity') >= s.d

Notice the requirement:

So if I have an employee who worked for 4 months eg. from 01/01/2013 to 01/05/2013 I'll have him in four months

From that I understand that the last active day is actually the day before finish_date.

If I use your "fix" I will include employee f in month 05 from my example. He finished in 2013-05-01:

('f', '2013-04-17', '2013-05-01'),

SQL Fiddle with your fix
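
To make the boundary case easy to reproduce without the fiddle, here is a minimal self-contained sketch. It assumes the same employee table shape as the query above and inlines only employee f through a VALUES list (the CTE is an illustrative stand-in for the real table):

```sql
-- Stand-in for the employee table: only employee f from the example.
WITH employee (name, start_date, finish_date) AS (
    VALUES ('f', date '2013-04-17', date '2013-05-01')
)
SELECT to_char(s.d, 'YYYY-MM') AS "Month", e.name
FROM  (
        SELECT generate_series(
            date '2013-01-01', date '2013-05-01', interval '1 month'
        )::date
      ) s(d)
JOIN   employee e
  ON   date_trunc('month', e.start_date)::date <= s.d
 AND   coalesce(e.finish_date, 'infinity') > s.d   -- strict: finish day is not an active day
ORDER  BY 1, 2;
-- Returns a single row: 2013-04 | f
-- With >= instead of >, a second row 2013-05 | f would appear.
```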

OTHER TIPS

Assuming that you really are not storing dates as character strings, but are only outputting them that way, then you can do:

SELECT employee_name
FROM employees
WHERE start_date <= <last date of month> and
      (finish_date >= <first date of month> or finish_date is null)
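
As a minimal sketch of how the placeholders might be filled in, assume the month of interest is June 2013 and is passed as its first day (the literal date is only an illustrative value). Comparing against the first day of the following month avoids having to compute the last day of the month:

```sql
-- Assumption: start_date and finish_date are of type date.
-- date '2013-06-01' stands for the hypothetical month parameter.
SELECT employee_name
FROM   employees
WHERE  start_date < date '2013-06-01' + interval '1 month'  -- started before July 1st
AND   (finish_date >= date '2013-06-01'                     -- finished on or after June 1st
       OR finish_date IS NULL);                             -- or still employed
```

Like the query above, this counts the day stored in finish_date as an active day.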

If you are storing them in this format, then you can do some fiddling with years and months.
This version turns the "dates" into strings of the form "YYYYMM". Just express the month you want like this and you can do the comparison:

select employee_name
from employees e
where right(start_date, 4)||substr(start_date, 4, 2) <= 'YYYYMM' and
      (right(finish_date, 4)||substr(finish_date, 4, 2) >= 'YYYYMM' or finish_date is null)

NOTE: the expression 'YYYYMM' is meant to be the month/year you are looking for.
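
To see what the string expression produces, it can be tried on a literal first. This sketch assumes the values really are text in 'DD/MM/YYYY' form:

```sql
-- right(..., 4) takes the year; substr(..., 4, 2) takes the two month digits.
SELECT right('01/05/2013', 4) || substr('01/05/2013', 4, 2) AS yyyymm;
-- → 201305
```

Because the result has the year first and a fixed width, plain string comparison orders these values chronologically.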

First, you can generate multiple date intervals easily with generate_series(). To get the lower and upper bounds, add an interval of 1 month to the start:

SELECT g::date                       AS d_lower
    , (g + interval '1 month')::date AS d_upper
FROM  generate_series('2013-01-01'::date, '2013-04-01', '1 month') g;

Produces:

  d_lower   |  d_upper
------------+------------
 2013-01-01 | 2013-02-01
 2013-02-01 | 2013-03-01
 2013-03-01 | 2013-04-01
 2013-04-01 | 2013-05-01

The upper bound of the time range is the first of the next month. This is on purpose, since we are going to use the standard SQL OVERLAPS operator further down. Quoting the manual at said location:

Each time period is considered to represent the half-open interval start <= time < end [...]
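
The half-open behavior can be checked with two literal periods. This is only an illustrative sketch; the dates are taken from the example range in the question:

```sql
SELECT (date '2013-04-01', date '2013-05-01')
       OVERLAPS (date '2013-05-01', date '2013-06-01') AS adjacent  -- false: only an endpoint in common
     , (date '2013-04-01', date '2013-05-01')
       OVERLAPS (date '2013-04-17', date '2013-05-01') AS inside;   -- true: shares April 17th to 30th
```

So an employee whose finish_date is 2013-05-01 matches April but not May.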

Next, you use a LEFT [OUTER] JOIN to connect employees to these date ranges:

SELECT to_char(m.d_lower, 'YYYY-MM') AS month_and_year, e.*
FROM  (
   SELECT g::date                       AS d_lower
       , (g + interval '1 month')::date AS d_upper
   FROM   generate_series('2013-01-01'::date, '2013-04-01', '1 month') g
   ) m
LEFT   JOIN employees e ON (m.d_lower, m.d_upper)
                  OVERLAPS (e.start_date, COALESCE(e.finish_date, 'infinity'))
ORDER  BY 1;

  • The LEFT JOIN includes date ranges even if no matching employees are found.

  • Use COALESCE(e.finish_date, 'infinity') for employees without a finish_date. They are considered to be still employed. Or maybe use current_date in place of infinity.

  • Use to_char() to get a nicely formatted month_and_year value.

  • You can easily select any columns you need from employees. In my example I take all columns with e.*.

  • The 1 in ORDER BY 1 is a positional reference to simplify the code. It orders by the first output column, month_and_year.

  • To make this fast, create a multi-column index on these expressions, like:

    CREATE INDEX employees_start_finish_idx
    ON employees (start_date, COALESCE(finish_date, 'infinity') DESC);
    

    Note the descending order on the second index-column.

  • If you should have committed the folly of storing temporal data as string types (text or varchar) with the pattern 'DD/MM/YYYY' instead of date or timestamp or timestamptz, convert the string to date with to_date(). Example:

    SELECT to_date('01/03/2013'::text, 'DD/MM/YYYY')
    

    Change the last line of the query to:

    ...
    OVERLAPS (to_date(e.start_date, 'DD/MM/YYYY')
             ,COALESCE(to_date(e.finish_date, 'DD/MM/YYYY'), 'infinity'))
    

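    You can even have a functional index like that, although to_date() is only STABLE, so it would first have to be wrapped in a function declared IMMUTABLE. But really, you should use a date or timestamp column.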

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow