Question

I have a postgres table that looks something like this:

proposal_id | nih_budget_start | nih_budget_end  | nsf_start_date | nsf_end_date   | award_amount
proposal_A  | 03/01/2000       | 12/31/2000      |                |                | 10,000
proposal_B  |                  |                 | 08/01/2005     | 07/31/2009     | 5,000,000
proposal_C  | 06/27/2012       | 11/17/2013      |                |                | 420,000

The dates have the date data type.

I'd like to create a view that tells me each year the proposal was funded, and what the average award amount was. So, the view might look something like this (option 1):

proposal_id | start_year | end_year | average_award
proposal_A  | 2000       | 2000     | 10,000
proposal_B  | 2005       | 2009     | 1,000,000 
proposal_C  | 2012       | 2013     | 210,000

Or -- even better -- this (option 2):

proposal_id | year | award
proposal_A  | 2000 | 10,000
proposal_B  | 2005 | 1,000,000
proposal_B  | 2006 | 1,000,000
proposal_B  | 2007 | 1,000,000
proposal_B  | 2008 | 1,000,000
proposal_B  | 2009 | 1,000,000
proposal_C  | 2012 | 210,000
proposal_C  | 2013 | 210,000

Also, it might be nice to have the award amount prorated for partial-year funding, but this isn't completely necessary.

Based on an answer suggested below, I'm currently doing this, which seems to be working as expected to get option 1 above:

CREATE VIEW award_per_year AS
select t1.proposal_id, t1.START_DATE, t1.END_DATE,
       (t1.adjusted_award_amount / ((t1.END_DATE - t1.START_DATE) + 1.)) avg_award
from (
   select t2.proposal_id,
          (extract(year from START_DATE)) START_DATE,
          (extract(year from END_DATE)) END_DATE,
          t2.adjusted_award_amount
   from (
      select proposal_id,
             case when nih_budget_start is not NULL then nih_budget_start else nsf_start_date end start_date,
             case when nih_budget_end is not NULL then nih_budget_end else nsf_end_date end end_date,
             adjusted_award_amount
      from proposal
   ) t2
) t1;

Solution

Option 1: use COALESCE

SELECT proposal_id, start_year, end_year
     , award_amount/((end_year - start_year) + 1.0) AS avg_award
FROM  (
   SELECT proposal_id
        , extract(year FROM COALESCE(nih_budget_start, nsf_start_date))::int AS start_year
        , extract(year FROM COALESCE(nih_budget_end, nsf_end_date))::int AS end_year
        , award_amount
   FROM   proposal
   ) sub;
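
The `+ 1.0` in the divisor is worth keeping. If award_amount is an integer type (an assumption, since the question does not state the column type), dividing by an integer year count would silently truncate the average. A quick check with literal values, independent of the table:

```sql
-- Integer / integer truncates in PostgreSQL; a numeric operand keeps the fraction.
SELECT 10000 / 3   AS integer_division   -- 3333
     , 10000 / 3.0 AS numeric_division;  -- fractional part preserved
```

Writing `1.0` (or casting one operand to numeric) forces numeric division regardless of how award_amount is stored.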

Option 2: use generate_series()

SELECT proposal_id
     , generate_series(start_year, end_year) AS year
     , award_amount/((end_year - start_year) + 1.0) AS avg_award
FROM  (
   SELECT proposal_id
        , extract(year FROM COALESCE(nih_budget_start, nsf_start_date))::int AS start_year
        , extract(year FROM COALESCE(nih_budget_end, nsf_end_date))::int AS end_year
        , award_amount
   FROM   proposal
   ) sub;

-> SQLfiddle
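
For the prorated variant mentioned in the question, one possible approach is to weight each year by the number of funded days that fall inside it. The following is only a sketch under several assumptions: the table is named proposal as above, end dates are inclusive, award_amount can be cast to numeric, and make_date() is available (PostgreSQL 9.4 or later). The aliases p, y, yr and prorated_award are made up for illustration.

```sql
-- Sketch: split award_amount across calendar years in proportion to funded days.
-- Assumes inclusive end dates and an award_amount that casts cleanly to numeric.
SELECT p.proposal_id
     , y.yr AS year
     , round(p.award_amount::numeric
             -- funded days inside this calendar year (date - date yields an integer)
             * (LEAST(p.end_date, make_date(y.yr, 12, 31))
              - GREATEST(p.start_date, make_date(y.yr, 1, 1)) + 1)
             -- total funded days over the whole award period
             / (p.end_date - p.start_date + 1), 2) AS prorated_award
FROM  (
   SELECT proposal_id
        , COALESCE(nih_budget_start, nsf_start_date) AS start_date
        , COALESCE(nih_budget_end, nsf_end_date)     AS end_date
        , award_amount
   FROM   proposal
   ) p
CROSS  JOIN LATERAL generate_series(extract(year FROM p.start_date)::int
                                  , extract(year FROM p.end_date)::int) AS y(yr);
```

A proposal funded entirely within one calendar year, such as proposal_A, keeps its full amount in that year; multi-year proposals get a smaller share for partial first and last years. Rows where both start or both end dates are NULL produce no output, because generate_series() returns no rows for NULL bounds. Prefixing the statement with CREATE VIEW ... AS, as in the question, would expose it as a view.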

OTHER TIPS

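Try this query. It assumes the dates are stored as MM/DD/YYYY text strings, with the two start columns named start_date1 and start_date2 and the two end columns named end_date1 and end_date2: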

select t1.proposal_ID, t1.START_DATE, t1.END_DATE,
       DIV(t1.award, (t1.END_DATE - t1.START_DATE + 1)) award
from (
   select t2.proposal_ID,
          (substr(t2.START_DATE,7,4)::integer) START_DATE,
          (substr(t2.END_DATE,7,4)::integer) END_DATE,
          t2.award
   from (
      select proposal_ID,
             case when length(start_date1) = 10 then start_date1 else start_date2 end start_date,
             case when length(end_date1) = 10 then end_date1 else end_date2 end end_date,
             award
      from table1
   ) t2
) t1;

SQL Fiddle

From your clarification it seems it would be better for you to get a list like below:

proposal_ID | years_funded  | average_award
proposal_A  | 2000          | 10,000
proposal_B  | 2005          | 1,000,000
proposal_B  | 2006          | 1,000,000
proposal_B  | 2007          | 1,000,000
proposal_B  | 2008          | 1,000,000
proposal_B  | 2009          | 1,000,000
proposal_C  | 2012          | 210,000
proposal_C  | 2013          | 210,000

On the front end, you can then use this list to display the year-by-year funding of each proposal. Please confirm.

Based on your input, here is a query that produces the first result set you want:

SELECT proposal_id,
     TO_CHAR(COALESCE(nih_budget_start, nsf_start_date),'YYYY') AS start_year,
     TO_CHAR(COALESCE(nih_budget_end, nsf_end_date),'YYYY') AS end_year,
     award_amount/(TO_CHAR(COALESCE(nih_budget_end, nsf_end_date),'YYYY')::INT - TO_CHAR(COALESCE(nih_budget_start, nsf_start_date),'YYYY')::INT+1) AS average_award
FROM Proposals

The following query produces the second result set using a recursive CTE:

WITH RECURSIVE dates AS
( 
     SELECT proposal_id,nih_budget_start,nsf_start_date,nih_budget_end,nsf_end_date, TO_CHAR(COALESCE(nih_budget_start, nsf_start_date),'YYYY')::INT  AS Dt,
     award_amount/(TO_CHAR(COALESCE(nih_budget_end, nsf_end_date),'YYYY')::INT - TO_CHAR(COALESCE(nih_budget_start, nsf_start_date),'YYYY')::INT+1) AS average_award
     FROM proposals
     UNION ALL
     SELECT proposal_id,nih_budget_start,nsf_start_date,nih_budget_end,nsf_end_date, d1.dt + 1, average_award FROM dates d1
     WHERE d1.dt < TO_CHAR(COALESCE(nih_budget_end, nsf_end_date),'YYYY')::INT 
 )
SELECT proposal_id, dt AS year, average_award  FROM dates d ORDER BY proposal_id,dt

See the code at SQLFiddle

Licensed under: CC-BY-SA with attribution