سؤال

When I run the following queries with SQL_CALC_FOUND_ROWS or specify all the colors RED,BLUE,YELLOW in the sub-query using an IN clause, I get a record count of 198. But when I do each color individually, I get a COUNT that adds up to 248 (72 + 59 + 118). Where are the extra 50 rows coming from? I would have thought that if I did them individually and then added them up, I would get 198, but I am getting 248. Am I viewing how the IN clause works incorrectly? I would prefer to use COUNT instead of SQL_CALC_FOUND_ROWS because I have found SQL_CALC_FOUND_ROWS to be much slower (+3 seconds)

Here are few stats:

1. I am running MySQL Version: 5.5.2 
2. The tables engine is INNODB.
3. All the `CHECK TABLE .... EXTENDED` return OK.

-- Returns 198
SELECT SQL_CALC_FOUND_ROWS DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('YELLOW','RED','BLUE');
SELECT FOUND_ROWS();

-- Returns 198
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('YELLOW','RED','BLUE')) all;

-- Returns 72
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('RED')) red;

-- Returns 59
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('BLUE')) blue;

-- Returns 118
SELECT COUNT(1) as MyCount FROM (SELECT DISTINCT p.* FROM Persons AS p
INNER JOIN Colors AS c ON c.PersonId = p.Id
WHERE p.Name = 'John Doe' AND c.ColorName IN ('YELLOW')) yellow;
هل كانت مفيدة؟

المحلول

Something like this is happening:

Persons table

ID  Name
--  ------------
 1  John Doe
 2  Xaisoft

Colors table

PersonID  ColorName
--------  ---------
       1  Red
       1  Yellow
       2  Yellow

Now query Persons.* for John Doe while joining to Colors:

 SELECT p.*
 FROM Persons p
 INNER JOIN Colors C ON c.PersonID = p.ID
 WHERE p.Name = 'John Doe';

Your result will be one p.* row for each color under John Doe:

ID  Name
--  ------------
 1  John Doe
 1  John Doe

If you apply a COUNT DISTINCT against this you'll end up with 1.

If you count Red and Yellow separately, you'll get one for each query. Add them up and you'll get 2.


Look at it this way: if you throw out the COUNT and just SELECT DISTINCT Persons.* you get these results:

Red or Yellow for John Doe:

 SELECT DISTINCT p.*
 FROM Persons p
 INNER JOIN Colors C ON c.PersonID = p.ID
 WHERE p.Name = 'John Doe'
   AND c.ColorName IN ('RED', 'YELLOW');

ID  Name
--  ----------
 1  John Doe

Red only for John Doe:

 SELECT DISTINCT p.*
 FROM Persons p
 INNER JOIN Colors C ON c.PersonID = p.ID
 WHERE p.Name = 'John Doe'
   AND c.ColorName IN ('RED');

ID  Name
--  ----------
 1  John Doe

Yellow only for John Doe:

 SELECT DISTINCT p.*
 FROM Persons p
 INNER JOIN Colors C ON c.PersonID = p.ID
 WHERE p.Name = 'John Doe'
   AND c.ColorName IN ('YELLOW');

ID  Name
--  ----------
 1  John Doe
مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top