Question

I know how a left join works for two tables, but how does it work for three (or more) tables?

SELECT col FROM table1 t1 
     LEFT JOIN table2 t2 ON t1.col = t2.col
     LEFT JOIN table3 t3 ON t1.col = t3.col
     LEFT JOIN table4 t4 ON t1.col = t4.col
     -- ...
     WHERE ...

How should this statement be interpreted?

Update:

What results do you get if you chain three left joins? What is the consequence of applying several left joins in sequence?

Solution

There are different ways you can configure LEFT JOIN when you have 4 tables.

Query 1

In the question you have (more or less):

SELECT t1.col, t2.info2, t3.info3, t4.info4
  FROM      table1 t1 
  LEFT JOIN table2 t2 ON t1.col = t2.col
  LEFT JOIN table3 t3 ON t1.col = t3.col
  LEFT JOIN table4 t4 ON t1.col = t4.col
 WHERE ...

Note that I've made sure that each table is needed by selecting a value from each table. If no column were selected from Table4, for example, the query optimizer might spot that and skip Table4 altogether, provided it can prove that the join cannot change the number of rows returned (for instance, because the join column is unique in Table4).

The query is a bit like a snowflake: Table1 is (left) joined in turn to Table2, Table3 and Table4. All the rows in Table1 that match the criteria in the WHERE clause will appear in the output (at least once). If there are any rows in Table2 that match the ON t1.col = t2.col condition, they will be selected; if there are no such rows, a row of NULLs will be used instead. Suppose that there are 2 rows in Table2 that match R1 from Table1, 3 rows in Table3 that match R1, and 4 rows in Table4 that match R1. Then there will be 2 × 3 × 4 = 24 rows in the output for R1 (unless the WHERE clause eliminates some of those).
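The row multiplication described above can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module rather than Informix, and a cut-down, hypothetical data set in which R1 has 2, 3 and 4 matches in table2, table3 and table4, while R2 has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down illustration data: only the join column and a sub-key.
    CREATE TABLE table1 (col TEXT PRIMARY KEY);
    CREATE TABLE table2 (col TEXT, sub2 INTEGER);
    CREATE TABLE table3 (col TEXT, sub3 INTEGER);
    CREATE TABLE table4 (col TEXT, sub4 INTEGER);
    INSERT INTO table1 VALUES ('R1'), ('R2');
    INSERT INTO table2 VALUES ('R1', 1), ('R1', 2);
    INSERT INTO table3 VALUES ('R1', 11), ('R1', 12), ('R1', 13);
    INSERT INTO table4 VALUES ('R1', 21), ('R1', 22), ('R1', 23), ('R1', 24);
""")

# Count the output rows produced for each table1 row.
counts = dict(conn.execute("""
    SELECT t1.col, COUNT(*)
      FROM      table1 t1
      LEFT JOIN table2 t2 ON t1.col = t2.col
      LEFT JOIN table3 t3 ON t1.col = t3.col
      LEFT JOIN table4 t4 ON t1.col = t4.col
     GROUP BY t1.col
""").fetchall())
print(counts)
```

R1 appears once for every combination of its matches, while R2, which matches nothing, still appears exactly once with NULLs in the joined columns.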

Query 2

SELECT t1.col, t2.info2, t3.info3, t4.info4
  FROM      table1 t1 
  LEFT JOIN table2 t2 ON t1.col = t2.col
  LEFT JOIN table3 t3 ON t2.col = t3.col
  LEFT JOIN table4 t4 ON t3.col = t4.col
 WHERE ...

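This is a different query altogether. It is a long chain: Table1 is joined to Table2; Table2 is joined to Table3; and Table3 is joined to Table4. Because each join hangs off the previous table, a Table1 row with no match in Table2 gets NULL for t2.col, so it cannot match anything in Table3 or Table4 either.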

Clearly, you can have other ways of joining the tables too if you wish.

Sample data

Consider this data. The syntax is correct for Informix, but if the TEMP notation for creating a temporary table doesn't work for your DBMS, drop the keyword TEMP and it should be OK.

CREATE TEMP TABLE table1 (col CHAR(2) NOT NULL PRIMARY KEY, info1 CHAR(15) NOT NULL);
INSERT INTO table1 VALUES('R1', 'Info T1 R1');
INSERT INTO table1 VALUES('R2', 'Info T1 R2');
INSERT INTO table1 VALUES('R3', 'Info T1 R3');
INSERT INTO table1 VALUES('R4', 'Info T1 R4');
INSERT INTO table1 VALUES('R5', 'Info T1 R5');
INSERT INTO table1 VALUES('R6', 'Info T1 R6');
INSERT INTO table1 VALUES('R7', 'Info T1 R7');
INSERT INTO table1 VALUES('R8', 'Info T1 R8');
CREATE TEMP TABLE table2 (col CHAR(2) NOT NULL, info2 CHAR(15) NOT NULL, sub2 INTEGER, PRIMARY KEY(col, sub2));
INSERT INTO table2 VALUES('R1', 'Info T2 R1 V1', 1); 
INSERT INTO table2 VALUES('R1', 'Info T2 R1 V2', 2); 
INSERT INTO table2 VALUES('R2', 'Info T2 R2 V1', 1); 
INSERT INTO table2 VALUES('R5', 'Info T2 R5 V1', 1); 
INSERT INTO table2 VALUES('R6', 'Info T2 R6 V1', 1); 
INSERT INTO table2 VALUES('R7', 'Info T2 R7 V1', 1); 
CREATE TEMP TABLE table3 (col CHAR(2) NOT NULL, info3 CHAR(15) NOT NULL, sub3 INTEGER, PRIMARY KEY(col, sub3));
INSERT INTO table3 VALUES('R1', 'Info T3 R1 V1', 11);
INSERT INTO table3 VALUES('R1', 'Info T3 R1 V2', 12);
INSERT INTO table3 VALUES('R1', 'Info T3 R1 V3', 13);
INSERT INTO table3 VALUES('R3', 'Info T3 R3 V1', 11);
INSERT INTO table3 VALUES('R5', 'Info T3 R5 V1', 11);
INSERT INTO table3 VALUES('R6', 'Info T3 R6 V1', 11);
CREATE TEMP TABLE table4 (col CHAR(2) NOT NULL, info4 CHAR(15) NOT NULL, sub4 INTEGER, PRIMARY KEY(col, sub4));
INSERT INTO table4 VALUES('R1', 'Info T4 R1 V1', 21);
INSERT INTO table4 VALUES('R1', 'Info T4 R1 V2', 22);
INSERT INTO table4 VALUES('R1', 'Info T4 R1 V3', 23);
INSERT INTO table4 VALUES('R1', 'Info T4 R1 V4', 24);
INSERT INTO table4 VALUES('R4', 'Info T4 R4 V1', 21);
INSERT INTO table4 VALUES('R5', 'Info T4 R5 V1', 21);
INSERT INTO table4 VALUES('R5', 'Info T4 R5 V2', 22);

The two queries produce different output; compare the rows for R3 and R4. The WHERE clause was omitted altogether.

Output from Query 1

R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V1
R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V2
R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V3
R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V4
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V1
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V2
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V3
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V4
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V1
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V2
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V3
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V4
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V1
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V2
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V3
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V4
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V1
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V2
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V3
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V4
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V1
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V2
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V3
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V4
R2  Info T2 R2 V1
R3                  Info T3 R3 V1
R4                                  Info T4 R4 V1
R5  Info T2 R5 V1   Info T3 R5 V1   Info T4 R5 V1
R5  Info T2 R5 V1   Info T3 R5 V1   Info T4 R5 V2
R6  Info T2 R6 V1   Info T3 R6 V1
R7  Info T2 R7 V1
R8

Output from Query 2

R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V1
R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V2
R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V3
R1  Info T2 R1 V1   Info T3 R1 V1   Info T4 R1 V4
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V1
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V2
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V3
R1  Info T2 R1 V1   Info T3 R1 V2   Info T4 R1 V4
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V1
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V2
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V3
R1  Info T2 R1 V1   Info T3 R1 V3   Info T4 R1 V4
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V1
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V2
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V3
R1  Info T2 R1 V2   Info T3 R1 V1   Info T4 R1 V4
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V1
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V2
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V3
R1  Info T2 R1 V2   Info T3 R1 V2   Info T4 R1 V4
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V1
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V2
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V3
R1  Info T2 R1 V2   Info T3 R1 V3   Info T4 R1 V4
R2  Info T2 R2 V1
R3
R4
R5  Info T2 R5 V1   Info T3 R5 V1   Info T4 R5 V1
R5  Info T2 R5 V1   Info T3 R5 V1   Info T4 R5 V2
R6  Info T2 R6 V1   Info T3 R6 V1
R7  Info T2 R7 V1
R8
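To reproduce the comparison without an Informix installation, the following sketch loads the same sample data into an in-memory SQLite database through Python's sqlite3 module and runs both queries. Using SQLite, plain tables instead of TEMP tables, and generated INSERT values are assumptions made to keep the example short; the names only_in_query1 and only_in_query2 are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# table1 holds R1..R8, as in the sample data.
conn.execute("CREATE TABLE table1 (col TEXT PRIMARY KEY, info1 TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(f"R{i}", f"Info T1 R{i}") for i in range(1, 9)],
)

# Number of rows per key in table2, table3 and table4, as in the sample data.
rows_per_key = {
    2: {"R1": 2, "R2": 1, "R5": 1, "R6": 1, "R7": 1},
    3: {"R1": 3, "R3": 1, "R5": 1, "R6": 1},
    4: {"R1": 4, "R4": 1, "R5": 2},
}
for n, keys in rows_per_key.items():
    conn.execute(
        f"CREATE TABLE table{n} (col TEXT NOT NULL, info{n} TEXT NOT NULL, "
        f"sub{n} INTEGER, PRIMARY KEY (col, sub{n}))"
    )
    conn.executemany(
        f"INSERT INTO table{n} VALUES (?, ?, ?)",
        [
            (key, f"Info T{n} {key} V{v}", (n - 2) * 10 + v)
            for key, count in keys.items()
            for v in range(1, count + 1)
        ],
    )

FIRST_JOIN = """
    SELECT t1.col, t2.info2, t3.info3, t4.info4
      FROM      table1 t1
      LEFT JOIN table2 t2 ON t1.col = t2.col
"""
# Query 1: every table is joined directly to table1.
query1 = FIRST_JOIN + """
      LEFT JOIN table3 t3 ON t1.col = t3.col
      LEFT JOIN table4 t4 ON t1.col = t4.col
"""
# Query 2: each table is joined to the previous one in the chain.
query2 = FIRST_JOIN + """
      LEFT JOIN table3 t3 ON t2.col = t3.col
      LEFT JOIN table4 t4 ON t3.col = t4.col
"""
rows1 = conn.execute(query1).fetchall()
rows2 = conn.execute(query2).fetchall()

# Rows that appear in one result but not in the other.
only_in_query1 = sorted(set(rows1) - set(rows2), key=lambda row: row[0])
only_in_query2 = sorted(set(rows2) - set(rows1), key=lambda row: row[0])
print(len(rows1), len(rows2))
print(only_in_query1)
print(only_in_query2)
```

Both queries return the same number of rows; only R3 and R4 differ. In Query 1 they pick up their Table3 and Table4 data directly from Table1, while in Query 2 the chain is already broken at Table2, so the later columns stay NULL.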

OTHER TIPS

Like this:

SELECT t1.col FROM table1 t1 
    LEFT JOIN table2 t2 ON t1.col = t2.col
    -- the first join returns every row of table1, paired with each table2 row where t1.col = t2.col, or with NULLs in the table2 columns when table2 has no match
    LEFT JOIN table3 t3 ON t1.col = t3.col
    -- the second join does the same, treating the result of the previous join as a temporary table on the left side
    LEFT JOIN table4 t4 ON t1.col = t4.col
    -- ...
    WHERE ...

When you chain multiple LEFT JOINs as in your post, read each one as allowing NULL values in the columns of the joined table. In your example, only Table1 needs to have records for the query to return rows: table2, table3 and table4 could all be empty and the query would still return rows. Take this query for instance:

SELECT *
  FROM Prices P
 RIGHT JOIN Articles A ON P.ArticleID = A.ArticleID
  LEFT JOIN Sales S    ON S.ArticleID = A.ArticleID

In this case you get your entire Articles table; if prices or sales exist for a given article you also get that information, but you always get ALL your articles.
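Here is a small sketch of that behaviour using Python's sqlite3 module. The column names other than ArticleID and all of the data are hypothetical. Because older SQLite versions do not support RIGHT JOIN, the query is written in the equivalent form that starts from Articles and uses only LEFT JOINs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and data, for illustration only.
    CREATE TABLE Articles (ArticleID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Prices   (ArticleID INTEGER, Price REAL);
    CREATE TABLE Sales    (ArticleID INTEGER, Quantity INTEGER);
    INSERT INTO Articles VALUES (1, 'Pen'), (2, 'Ink'), (3, 'Pad');
    INSERT INTO Prices   VALUES (1, 1.5);
    INSERT INTO Sales    VALUES (1, 4), (1, 10), (2, 7);
""")

# Articles is the preserved table; Prices and Sales are optional extras.
rows = conn.execute("""
    SELECT A.Name, P.Price, S.Quantity
      FROM Articles A
      LEFT JOIN Prices P ON P.ArticleID = A.ArticleID
      LEFT JOIN Sales  S ON S.ArticleID = A.ArticleID
     ORDER BY A.ArticleID, S.Quantity
""").fetchall()
for row in rows:
    print(row)
```

Every article shows up at least once: the one with two sales appears twice, and the one with neither a price nor a sale appears with NULL (None in Python) in both columns.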

Let's re-imagine how your original SQL could be expressed in a sort of pseudo-code. Let Join(table1_expression, table2_expression, on_condition) represent a LEFT JOIN operation, the result of which is an unnamed temporary table.

-- Pseudo-code for: FROM table1 t1
LET t1 = table1

-- Pseudo-code for: LEFT JOIN table2 t2 ON t1.col = t2.col
LET t2 = table2
LET S1 = Join(t1, t2, t1.col = t2.col)

Here S1 is a reference to the resulting (unnamed temporary) table that holds the rows of the LEFT JOIN. Since you already know what a LEFT JOIN does for two tables, there's no need to go into that here. But it should be said that S1 has all of the columns of both t1 and t2. So, for the sake of discussion, let's say that in this pseudo-code all t1 columns that go into S1 are written as S1.t1.*, and all t2 columns as S1.t2.*. For example, t1.col and t2.col in S1 are S1.t1.col and S1.t2.col.

-- Pseudo-code for: LEFT JOIN table3 t3 ON t1.col = t3.col
LET t3 = table3
LET S2 = Join(S1, t3, S1.t1.col = t3.col)

Here S1 is treated like any other table. S1.t1.col is simply how the t1.col of the original SQL is written in the pseudo-code.

OK, so far the pseudo-code (functionally) reflects what happens in real-life SQL. At first, it starts with t1 (table1) as the result of the query. Then it goes to S1 because of the first LEFT JOIN. From there, the result of the query becomes the rows of S2, the second LEFT JOIN.

-- Pseudo-code for: LEFT JOIN table4 t4 ON t1.col = t4.col
LET t4 = table4
LET S3 = Join(S2, t4, S2.t1.col = t4.col)

The result of the query changes with every subsequent LEFT JOIN. Note that every column of each table (table1 through table4) remains in play until the system processes the SELECT list. (Speaking of which, your example needs to be SELECT t1.col, since a column named col exists in all your tables.) Finally, the WHERE clause filters rows out of S3, and what remains is the actual result of your example SQL.


It should be pointed out that a real SQL database most likely would not materialize a temporary table for each LEFT JOIN the way the pseudo-code does. Functionally speaking, though, the rows returned would be the same.
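The pseudo-code above can be turned into a small runnable model. The sketch below is purely illustrative: left_join is a hypothetical helper written for this example, rows are plain Python dicts keyed by table alias, and the data is made up. It is not how a database engine executes the query.

```python
def left_join(left_rows, right_rows, right_alias, on):
    """Model of Join(): LEFT JOIN a list of joined rows with a plain table.

    Each left row maps a table alias to that table's columns (or None).
    """
    result = []
    for left in left_rows:
        matches = [right for right in right_rows if on(left, right)]
        # No match: keep the left row and pad the right side with NULL (None).
        for right in matches or [None]:
            result.append({**left, right_alias: right})
    return result


# Made-up tables; table4 is deliberately empty.
table1 = [{"col": "R1"}, {"col": "R2"}, {"col": "R3"}]
table2 = [{"col": "R1", "sub2": 1}, {"col": "R1", "sub2": 2}]
table3 = [{"col": "R1", "sub3": 11}, {"col": "R1", "sub3": 12}, {"col": "R3", "sub3": 11}]
table4 = []


def same_col_as_t1(joined, right):
    # Models the condition "t1.col = <right table>.col".
    return joined["t1"]["col"] == right["col"]


t1 = [{"t1": row} for row in table1]                 # FROM table1 t1
S1 = left_join(t1, table2, "t2", same_col_as_t1)     # LEFT JOIN table2 t2 ...
S2 = left_join(S1, table3, "t3", same_col_as_t1)     # LEFT JOIN table3 t3 ...
S3 = left_join(S2, table4, "t4", same_col_as_t1)     # LEFT JOIN table4 t4 ...

for row in S3:
    print(row)
```

Each call consumes the previous result as its left input, just as S1, S2 and S3 do in the pseudo-code. Joining against the empty table4 keeps every row of S2 and only adds a NULL t4 side.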

You can left-join as many tables together as you like; see below for an example with three tables. There is also no restriction on which columns you join on for the third table or any table after it.

SELECT a.*, b.*, c.* 
FROM [first_table] as a

LEFT JOIN [second_table] as b
ON a.some_column = b.some_other_column

LEFT JOIN [third_table] as c
ON a.some_second_column = c.yet_another_column
Licensed under: CC-BY-SA with attribution