Question

I have a table with a list of dates where an employee became Active/Inactive and I want to count the weeks that an employee was Active within a certain date range.

So the table (ps_job) would have values like this:

EMPLID     EFFDT       HR_STATUS
------     -----       ------
1000       01-Jul-11   A
1000       01-Sep-11   I
1000       01-Jan-12   A
1000       01-Mar-12   I
1000       01-Sep-12   A

The query would need to show me the number of weeks that this emplid was active from 01-Jul-11 to 31-Dec-12.

The desired result set would be:

EMPLID     WEEKS_ACTIVE
------     ------------
1000       35

I got the number 35 by adding the results from the SQLs below:

SELECT (NEXT_DAY('01-Sep-11','SUNDAY') - NEXT_DAY('01-Jul-11','SUNDAY'))/7 WEEKS_ACTIVE FROM DUAL;
SELECT (NEXT_DAY('01-Mar-12','SUNDAY') - NEXT_DAY('01-Jan-12','SUNDAY'))/7 WEEKS_ACTIVE FROM DUAL;
SELECT (NEXT_DAY('31-Dec-12','SUNDAY') - NEXT_DAY('01-Sep-12','SUNDAY'))/7 WEEKS_ACTIVE FROM DUAL;
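
The string literals above rely on implicit conversion through the session's NLS_DATE_FORMAT. As a sketch, assuming NLS_DATE_LANGUAGE is English so that 'SUNDAY' is recognised, the three statements can be combined into one using ANSI date literals:

```sql
-- Same arithmetic as the three statements above, with explicit DATE literals
-- so the result does not depend on NLS_DATE_FORMAT.
SELECT ( (NEXT_DAY(DATE '2011-09-01', 'SUNDAY') - NEXT_DAY(DATE '2011-07-01', 'SUNDAY'))
       + (NEXT_DAY(DATE '2012-03-01', 'SUNDAY') - NEXT_DAY(DATE '2012-01-01', 'SUNDAY'))
       + (NEXT_DAY(DATE '2012-12-31', 'SUNDAY') - NEXT_DAY(DATE '2012-09-01', 'SUNDAY'))
       ) / 7 AS WEEKS_ACTIVE
FROM DUAL;
```

The three differences are 63, 56 and 126 days, i.e. 9 + 8 + 18 = 35 weeks. Note that NEXT_DAY returns the first Sunday strictly after its argument, so 01-Jan-12 (itself a Sunday) maps to 08-Jan-12.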

The problem is I can't seem to figure out how to create a single query statement that will go through all the rows for every employee within a certain date range and just return each emplid and the number of weeks they were active. I would prefer to use basic SQL instead of PL/SQL so that I can transfer it to a PeopleSoft query that can be run by the user, but I am willing to run it for the user using Oracle SQL Developer if need be.

Database: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

Solution

Here I'm using lead in a subquery to get the next date and then summing the intervals in the outer query:

with q as (
    -- pair each status row with the date of the employee's next status change
    select EMPLID, EFFDT, HR_STATUS
        , lead (EFFDT, 1) over (partition by EMPLID order by EFFDT) as NEXT_EFFDT
    from ps_job
)
select EMPLID
    -- add up the length of every active stint in weeks, then drop the fraction
    , trunc(sum((trunc(coalesce(NEXT_EFFDT, current_timestamp)) - trunc(EFFDT)) / 7)) as WEEKS_ACTIVE
from q
where HR_STATUS = 'A'
group by EMPLID;

The coalesce call falls back to the current date when lead finds no later row, i.e. the employee's last record is an A and they are still active. You could substitute the end of the year if that's your spec.

Note that I'm not doing any rigorous testing to see that your entries are ordered A/I/A/I etc., so you might want to add checks of that nature if you know your data requires it.

Feel free to play with this at SQL Fiddle.
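
For comparison, here is a rough sketch that keeps the lead pairing but applies the NEXT_DAY week counting from the question and clamps each stint to a reporting window. The params subquery and its range_start/range_end columns are hypothetical names introduced for illustration, and the sketch assumes one row per EMPLID and EFFDT, as in the sample data, plus an English NLS_DATE_LANGUAGE.

```sql
WITH params AS (
    -- hypothetical reporting window; replace with the dates you need
    SELECT DATE '2011-07-01' AS range_start
         , DATE '2012-12-31' AS range_end
    FROM dual
),
q AS (
    -- compute the next status date over ALL rows before filtering on status
    SELECT j.EMPLID, j.EFFDT, j.HR_STATUS
         , LEAD(j.EFFDT) OVER (PARTITION BY j.EMPLID ORDER BY j.EFFDT) AS NEXT_EFFDT
    FROM ps_job j
)
SELECT q.EMPLID
     -- clamp each active stint to the window, then count Sunday-to-Sunday weeks
     , SUM( NEXT_DAY(LEAST(COALESCE(q.NEXT_EFFDT, p.range_end), p.range_end), 'SUNDAY')
          - NEXT_DAY(GREATEST(q.EFFDT, p.range_start), 'SUNDAY') ) / 7 AS WEEKS_ACTIVE
FROM q
CROSS JOIN params p
WHERE q.HR_STATUS = 'A'
  AND q.EFFDT <= p.range_end                                -- stint starts inside or before the window
  AND COALESCE(q.NEXT_EFFDT, p.range_end) >= p.range_start  -- stint ends inside or after the window
GROUP BY q.EMPLID;
```

With the sample rows this should give 63 + 56 + 126 = 245 days, i.e. 35 weeks for EMPLID 1000.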

OTHER TIPS

If the customer just wants a rough estimate I'd start with the number of days for each stint, divided by 7 and rounded.

The trick is to line up each Active date with its corresponding Inactive date. The best way I can think of to do this is to pick out the Active and Inactive dates separately, rank them by date, and join them back together by EmplID and rank. The ROW_NUMBER() analytic function is the best way to rank in this situation:

WITH
  EmpActive AS (
    SELECT
        EmplID,
        EffDt,
        ROW_NUMBER() OVER (PARTITION BY EmplID ORDER BY EffDt NULLS LAST) DtRank
      FROM ps_job
      WHERE HR_Status = 'A'
  ),
  EmpInactive AS (
   SELECT
      EmplID,
      EffDt,
      ROW_NUMBER() OVER (PARTITION BY EmplID ORDER BY EffDt NULLS LAST) DtRank
    FROM ps_job
    WHERE HR_Status = 'I'
  )
SELECT
  EmpActive.EmplID,
  EmpActive.EffDt AS ActiveDate,
  EmpInactive.EffDt AS InactiveDate,
  ROUND((NVL(EmpInactive.EffDt, TRUNC(SYSDATE)) - EmpActive.EffDt) / 7) AS WeeksActive
FROM EmpActive
LEFT JOIN EmpInactive ON
    EmpActive.EmplID = EmpInactive.EmplID AND
    EmpActive.DtRank = EmpInactive.DtRank

The third gig for EmplID = 1000 has an active date but no inactive date, hence the left join between the two subqueries. (The NULLS LAST in the ROW_NUMBER ordering only matters if EffDt itself can be null.)

I've used the "days / 7" math here; you can substitute what you need when you hear back from the customer. Note that if there isn't a corresponding inactive date the query uses the current date.

There's a SQLFiddle of this here.
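
The query above returns one row per stint, whereas the question asks for one row per employee. A possible roll-up, sketched under the same assumption that A and I rows strictly alternate, wraps the per-stint figure in SUM and groups by EmplID (the aliases a and i are only shorthand):

```sql
WITH
  EmpActive AS (
    SELECT EmplID, EffDt,
           ROW_NUMBER() OVER (PARTITION BY EmplID ORDER BY EffDt) AS DtRank
      FROM ps_job
     WHERE HR_Status = 'A'
  ),
  EmpInactive AS (
    SELECT EmplID, EffDt,
           ROW_NUMBER() OVER (PARTITION BY EmplID ORDER BY EffDt) AS DtRank
      FROM ps_job
     WHERE HR_Status = 'I'
  )
SELECT a.EmplID,
       -- round each stint to whole weeks, then total per employee
       SUM(ROUND((NVL(i.EffDt, TRUNC(SYSDATE)) - a.EffDt) / 7)) AS WeeksActive
  FROM EmpActive a
  LEFT JOIN EmpInactive i
    ON a.EmplID = i.EmplID
   AND a.DtRank = i.DtRank
 GROUP BY a.EmplID;
```

Rounding each stint before summing can give a different total than summing the days first and rounding once, so pick whichever matches the customer's definition.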

The following should work for what you are trying to do. I did have to hard-code the end date in the NVL call:

SELECT emplid,
       hr_status,
       ROUND(SUM(end_date - start_date)/7) num_weeks
  FROM (SELECT emplid,
               hr_status,
               effdt start_date,
               NVL(LEAD(effdt) OVER (PARTITION BY emplid ORDER BY effdt), 
                                        TO_DATE('12312012','MMDDYYYY')) end_date
          FROM ps_job
       )
 WHERE hr_status = 'A'
 GROUP BY emplid,
          hr_status
 ORDER BY emplid

The inner query pulls the employee and HR status from the table, uses effdt as the start date, and uses the LEAD analytic function to get the next effdt value. That value marks the start of the next status, so it is the end_date of the current row. If LEAD returns NULL, we assign the finish date you wanted (12/31/2012). The outer statement then limits the result set to the rows with the active HR status and calculates the weeks.
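
As a quick sanity check of the arithmetic for EmplID 1000, assuming the 12/31/2012 end date, the three active stints can be added up directly from DUAL (days_active and the two week columns are illustrative names):

```sql
SELECT days_active
     , ROUND(days_active / 7) AS weeks_rounded
     , TRUNC(days_active / 7) AS weeks_truncated
  FROM (SELECT (DATE '2011-09-01' - DATE '2011-07-01')   -- first active stint
             + (DATE '2012-03-01' - DATE '2012-01-01')   -- second active stint
             + (DATE '2012-12-31' - DATE '2012-09-01')   -- third stint, capped at year end
               AS days_active
          FROM dual);
```

This should return 243 days, which ROUND turns into 35 weeks and TRUNC into 34, so the choice of rounding decides whether the result matches the 35 in the question.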

Licensed under: CC-BY-SA with attribution