Semi-distinct union

https://stackoverflow.com/questions/20130588

03-08-2022

Question

There are UNION and UNION ALL operators in SQL. The first one drops all the duplicates, the second one doesn't. I want to drop only duplicates which originated in different subqueries, but keep those which come from the same one. Example:

TABLE t1:     TABLE t2:     TABLE t3:
a | b         a | b         a | b
-----         -----         -----
1 | 2         1 | 2         1 | 3
1 | 2                       1 | 3

I want (SELECT * FROM t1) UNION SEMI (SELECT * FROM t3) to return all four rows, and (SELECT * FROM t2) UNION SEMI (SELECT * FROM t2) to return one row.
I don't really care what (SELECT * FROM t1) UNION SEMI (SELECT * FROM t2) would return, but it would be nice if that somehow depended on the order of the subqueries, e.g. two rows for that last example and one row for the reversed order, (SELECT * FROM t2) UNION SEMI (SELECT * FROM t1).

I can do it with a huge query, but is there a standard method for such an operation?
Thanks in advance.

Solution

You want all the rows of table1, plus all the rows of table2 that have no matching row in table1:

SELECT *
FROM table1

UNION ALL

SELECT t2.*
FROM table2 t2
LEFT JOIN table1 t1 USING (a, b, ...) -- list all the columns here
WHERE t1.a IS NULL -- keep only the table2 rows that found no match in table1
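
To check this against the sample data from the question, here is a minimal sketch of the same anti-join idea, written with NOT EXISTS instead of LEFT JOIN ... IS NULL. The column types and the two-column layout are assumptions taken from the example tables. Neither form treats NULLs as equal, so a row of the second table that contains a NULL would always be kept.

```sql
-- Sample data from the question (INT column types are an assumption).
CREATE TABLE t1 (a INT, b INT);
CREATE TABLE t2 (a INT, b INT);
CREATE TABLE t3 (a INT, b INT);
INSERT INTO t1 VALUES (1, 2), (1, 2);
INSERT INTO t2 VALUES (1, 2);
INSERT INTO t3 VALUES (1, 3), (1, 3);

-- t1 "UNION SEMI" t3: every row of t1, plus the rows of t3
-- that have no equal row in t1. Duplicates inside each table survive
-- because UNION ALL does not deduplicate.
SELECT a, b
FROM t1

UNION ALL

SELECT t3.a, t3.b
FROM t3
WHERE NOT EXISTS (
    SELECT 1
    FROM t1
    WHERE t1.a = t3.a
      AND t1.b = t3.b
);
```

With this data the query returns all four rows. Using t2 on both sides gives one row, t1 followed by t2 gives two rows, and t2 followed by t1 gives one row, so the result depends on the order of the subqueries in the way the question describes.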

Licensed under: CC-BY-SA with attribution