Question

I have two sub-queries that create these tables:

 date      | name | data x
-----------+------+-------
2013-07-01 | a    |   2
2013-07-01 | c    |   3
2013-07-01 | d    |   1

 date      | name | data y
-----------+------+-------
2013-07-01 | a    |   13
2013-07-01 | b    |   16
2013-07-01 | d    |   20

I want to do a full join using both date and name as the join criteria. (Date is not limited to 2013-07-01, so the date and name fields combined form a unique pseudo-identifier.)

Ideally the result should look something like:

 date      | name | data x | data y
-----------+------+--------+-------
2013-07-01 | a    |   2    |  13
2013-07-01 | b    |        |  16
2013-07-01 | c    |   3    |    
2013-07-01 | d    |   1    |  20

(Ideally I could put in zeros for the nulls, but that can be dealt with later.)

I used a query similar to this:

select 
table1.date, table1.name, table1.dataX, table2.dataY
from table1
full join table2 on table1.date=table2.date and table1.name=table2.name

Postgres is only bringing in rows that exist in both tables (so just the rows with names a and d in this example), which really defeats the point of a full join.

I tried different ways to troubleshoot this; the only one that kind of worked so far is:

select 
table1.date, table2.date, table1.name, table2.name, table1.dataX, table2.dataY
from table1
full join table2 on table1.date=table2.date and table1.name=table2.name

returns:

 date      |date        | name | name | data x | data y
-----------+------------+------+------+--------+-------
2013-07-01 | 2013-07-01 |  a   |   a  |   2    |  13
           | 2013-07-01 |      |   b  |        |  16
2013-07-01 |            |  c   |      |   3    |    
2013-07-01 | 2013-07-01 |  d   |   d  |   1    |  20

There are workarounds to make this work when I use the data, but that is really not ideal. Is there any way to make the query return the desired result?

Losing quite some hair here. Thanks a bunch!

Solution

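Don't use a WHERE clause, but a JOIN condition. A WHERE condition on columns of either table is evaluated after the join and removes the rows that the full join padded with NULL, which can make the FULL JOIN behave like a plain inner join. The USING clause comes in handy: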

SELECT the_date, name, t1.data_x, t2.data_y
FROM   tbl1 t1
FULL   JOIN tbl2 t2 USING (the_date, name);
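
To try this without creating any tables, here is a minimal sketch that inlines the sample rows from the question as CTEs. The names tbl1, tbl2, the_date, data_x and data_y are the ones assumed in the query above, not the asker's real identifiers, and the ORDER BY is only added to make the output stable.

```sql
-- Sample rows from the question, inlined as CTEs so the query is self-contained.
-- Table and column names are assumptions taken from the answer's query.
WITH tbl1 (the_date, name, data_x) AS (
   VALUES (date '2013-07-01', 'a', 2)
        , (date '2013-07-01', 'c', 3)
        , (date '2013-07-01', 'd', 1)
   )
, tbl2 (the_date, name, data_y) AS (
   VALUES (date '2013-07-01', 'a', 13)
        , (date '2013-07-01', 'b', 16)
        , (date '2013-07-01', 'd', 20)
   )
-- USING merges the join columns, so the_date and name appear once
-- and are filled from whichever side has the row.
SELECT the_date, name, t1.data_x, t2.data_y
FROM   tbl1 t1
FULL   JOIN tbl2 t2 USING (the_date, name)
ORDER  BY the_date, name;
```

This should return four rows (a, b, c, d), with data_x NULL for b and data_y NULL for c, which matches the desired result in the question.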

To prove a point, it could be done without USING:

SELECT COALESCE(t1.the_date, t2.the_date) AS the_date
      ,COALESCE(t1.name, t2.name) AS name
      ,t1.data_x, t2.data_y
FROM   tbl1 t1
FULL   JOIN tbl2 t2 ON t1.the_date = t2.the_date
                   AND t1.name = t2.name;

The second form might be useful for related queries. The first one is more elegant, slightly faster, and also standard SQL.
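
Two related variations, as a sketch rather than a definitive recipe: the question asks for zeros instead of NULL, which COALESCE on the data columns can provide, and any per-table filter can go into a subquery so that it runs before the join instead of in a WHERE clause after it. The date range below is a hypothetical filter chosen for illustration; the sample rows are again inlined from the question.

```sql
-- Same inlined sample data as in the question; names are assumed, as above.
WITH tbl1 (the_date, name, data_x) AS (
   VALUES (date '2013-07-01', 'a', 2)
        , (date '2013-07-01', 'c', 3)
        , (date '2013-07-01', 'd', 1)
   )
, tbl2 (the_date, name, data_y) AS (
   VALUES (date '2013-07-01', 'a', 13)
        , (date '2013-07-01', 'b', 16)
        , (date '2013-07-01', 'd', 20)
   )
SELECT the_date, name
     , COALESCE(t1.data_x, 0) AS data_x   -- 0 where tbl1 has no matching row
     , COALESCE(t2.data_y, 0) AS data_y   -- 0 where tbl2 has no matching row
FROM  (
   SELECT * FROM tbl1
   WHERE  the_date >= date '2013-07-01'   -- hypothetical filter, applied before the join
   AND    the_date <  date '2013-08-01'
   ) t1
FULL   JOIN (
   SELECT * FROM tbl2
   WHERE  the_date >= date '2013-07-01'   -- same hypothetical filter on the other side
   AND    the_date <  date '2013-08-01'
   ) t2 USING (the_date, name)
ORDER  BY the_date, name;
```

Because each side is filtered before joining, rows that exist in only one table survive, and the unmatched side shows 0 instead of NULL.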

-> SQLfiddle (demonstrating both)

Licensed under: CC-BY-SA with attribution