Question

I can't get a query to work that I think should be valid. It has the form:

SELECT DISTINCT a, b, c FROM t1 WHERE NOT IN ( SELECT DISTINCT a,b,c FROM t2 ) AS alias

But MySQL chokes where "IN (" starts. Does MySQL support this syntax? If not, how can I get these results? I want to find the distinct tuples of (a,b,c) in table 1 that don't exist in table 2.

Solution

You should use NOT EXISTS:

SELECT DISTINCT a, b, c FROM t1 WHERE NOT EXISTS (SELECT NULL FROM t2 WHERE t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c)

NOT IN is not the best way to do this, even when you compare only one key. With NOT EXISTS, the DBMS only has to probe the indexes, if indexes exist on the needed columns, whereas for NOT IN it has to read the actual data and build a full result set that it then has to check.
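For anyone who wants to try the NOT EXISTS query on a toy data set, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, and the table contents are invented for illustration. It only shows which rows come back, not how MySQL executes the query.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
# The rows below are made up for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO t1 VALUES (1, 1, 1), (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO t2 VALUES (1, 1, 1), (3, 3, 4);
""")

# Keep each distinct (a, b, c) from t1 that has no exact match in t2.
rows = conn.execute("""
    SELECT DISTINCT a, b, c
    FROM t1
    WHERE NOT EXISTS (
        SELECT NULL FROM t2
        WHERE t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c
    )
    ORDER BY a, b, c
""").fetchall()

print(rows)  # [(2, 2, 2), (3, 3, 3)]
```

(1, 1, 1) is dropped because t2 contains it; (3, 3, 3) stays because t2 only has (3, 3, 4), which differs in c.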

Using a LEFT JOIN and then checking for NULL is also a bad idea. It will be painfully slow when the tables are big, since the query has to perform the whole join, reading both tables fully, and then throw most of it away. Also, if the column you test for NULL can itself hold NULL values in t2, the check will report false positives.
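The NULL caveat is easier to see on a small example. The sketch below again assumes SQLite via Python's sqlite3 rather than MySQL, and the `note` column is a hypothetical nullable column added only for this demonstration. Testing a column that appears in the join condition is safe, while testing a nullable column that is not part of the join can flag matched rows as missing.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; "note" is a hypothetical nullable column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER);
    CREATE TABLE t2 (a INTEGER, note TEXT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1, 'x'), (2, NULL);
""")

# Test on the join column: a matched row always has a non-NULL t2.a here.
safe = conn.execute("""
    SELECT t1.a FROM t1 LEFT JOIN t2 ON t1.a = t2.a
    WHERE t2.a IS NULL
    ORDER BY t1.a
""").fetchall()

# Test on a nullable non-join column: the matched row a = 2 looks unmatched.
risky = conn.execute("""
    SELECT t1.a FROM t1 LEFT JOIN t2 ON t1.a = t2.a
    WHERE t2.note IS NULL
    ORDER BY t1.a
""").fetchall()

print(safe)   # [(3,)]
print(risky)  # [(2,), (3,)]
```

Only a = 3 is really absent from t2; the second query also reports a = 2 because its stored `note` happens to be NULL.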

OTHER TIPS

I had trouble figuring out the right way to execute this query, even with the answers provided; then I found the MySQL documentation reference I needed:

SELECT DISTINCT store_type
FROM stores 
WHERE NOT EXISTS (SELECT * FROM cities_stores WHERE cities_stores.store_type = stores.store_type);

The trick I had to wrap my brain around was using the reference to the 'stores' table from the first query inside the subquery. Hope this helps (or helps others, since this is an old thread.)

From http://dev.mysql.com/doc/refman/5.0/en/exists-and-not-exists-subqueries.html

SELECT DISTINCT t1.* FROM t1 LEFT JOIN t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE t2.a IS NULL

As far as I know, NOT IN can only be used for one field at a time, and the field has to be specified between "WHERE" and "NOT IN".

(Edit:) Try using a NOT EXISTS:

SELECT a, b, c 
FROM t1 
WHERE NOT EXISTS 
   (SELECT * 
   FROM t2 
   WHERE t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c)

In addition, an INNER JOIN on a, b, and c being equal should give you the tuples that appear in both tables, while a LEFT JOIN with a WHERE ... IS NULL clause should give you the tuples that appear only in t1, as Charles mentioned below.
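A rough sketch of the two joins side by side, assuming SQLite via Python's sqlite3 in place of MySQL and invented sample rows:

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO t1 VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO t2 VALUES (1, 1, 1), (3, 3, 4);
""")

# INNER JOIN keeps the tuples that have a match in t2.
in_both = conn.execute("""
    SELECT DISTINCT t1.a, t1.b, t1.c
    FROM t1 INNER JOIN t2
      ON t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c
    ORDER BY t1.a
""").fetchall()

# LEFT JOIN plus IS NULL on a join column keeps the tuples without a match.
only_in_t1 = conn.execute("""
    SELECT DISTINCT t1.a, t1.b, t1.c
    FROM t1 LEFT JOIN t2
      ON t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c
    WHERE t2.a IS NULL
    ORDER BY t1.a
""").fetchall()

print(in_both)     # [(1, 1, 1)]
print(only_in_t1)  # [(2, 2, 2), (3, 3, 3)]
```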

Well, I'm going to answer my own question, in spite of all the great advice others gave.

Here's the proper syntax for what I was trying to do.

SELECT DISTINCT a, b, c FROM t1 WHERE (a,b,c) NOT IN ( SELECT DISTINCT a,b,c FROM t2 )

I can't vouch for its efficiency, but the broader question I was implicitly asking was "How do I express this thought in SQL?", not "How do I get a particular result set?". I know that's unfair to everyone who took a stab, sorry!
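The tuple form can be tried out quickly with the sketch below. It is run against SQLite through Python's sqlite3, which is an assumption made for convenience and not the MySQL setup from the question; SQLite accepts this row-value syntax from version 3.15 on. The data is invented and deliberately free of NULLs, since NOT IN and NOT EXISTS can give different answers once NULLs are involved.

```python
import sqlite3

# Assumption: SQLite (3.15 or newer, for row values) stands in for MySQL.
assert sqlite3.sqlite_version_info >= (3, 15, 0), "row values need SQLite 3.15+"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO t1 VALUES (1, 1, 1), (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO t2 VALUES (1, 1, 1), (3, 3, 4);
""")

# The column list (a, b, c) goes between WHERE and NOT IN; the subquery has no alias.
not_in_rows = conn.execute("""
    SELECT DISTINCT a, b, c
    FROM t1
    WHERE (a, b, c) NOT IN (SELECT DISTINCT a, b, c FROM t2)
    ORDER BY a, b, c
""").fetchall()

print(not_in_rows)  # [(2, 2, 2), (3, 3, 3)]
```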

You need to add a column list after WHERE and remove the alias.

I tested this with a similar table and it works.

SELECT DISTINCT a, b, c 
FROM t1 WHERE (a,b,c)
NOT IN (SELECT DISTINCT a,b,c FROM t2)

Using the mysql world db:

-- exclude the cities with ID 1 and 2
SELECT DISTINCT id, name FROM city
WHERE (id, name)
NOT IN (SELECT id, name FROM city WHERE ID IN (1,2))
Licensed under: CC-BY-SA with attribution