Question

I have a large database with many tables. Some of these tables may contain an email field. They are all the same type of varchar(255). What I want to be able to do is join all of the tables with an email column together and get a list of all the unique emails on the database.

The problem I have is there will be multiple tables with that column so I need it to be high performance. The second problem is that afaik, joining tables together means that they must have a common value from the first table to the joined table.

I want to be able to do something like:

SELECT
    email
FROM
    table_1
LEFT JOIN
    table_2
USING (email)
LEFT JOIN
    table_3
USING (email)

However, I want to be able to get all of the emails from all the tables regardless of if they exist in the other table.

Would it be quicker to do a union select on all the tables in a temp table and select distinct from that?

I need to be able to do this over lots of tables (10+) with thousands to millions of rows over them potentially. If needed, I will set the email columns up as an INDEX.

Thanks

Était-ce utile?

La solution

If you want unique emails from all the tables, including emails that only occur in one table, you should use UNION instead of JOIN

SELECT DISTINCT email FROM table_a
UNION SELECT DISTINCT email FROM table_b
UNION SELECT DISTINCT email FROM table_c
...
Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top