Query returning one "extra" record. Any advice on how to remove it from the query results?

StackOverflow https://stackoverflow.com/questions/23550816

  •  18-07-2023
  •  | 
  •  

Pergunta

We have the following, quite complex (at least for us) query. Since, as far as we know, there's no such thing as INTERSECT on MySQL, we are wondering how can we fix this:

( 
  SELECT GROUP_CONCAT(APA_T.district ORDER BY APA_T.district), t.name 
  FROM tbl_activity AS t 
   INNER JOIN tbl_activity_package AS ap ON t.id = ap.id_activity 
   INNER JOIN ( 
    SELECT DISTINCT apa.district AS district, ( 
     SELECT s1.id_activity_package
     FROM tbl_activity_package_address s1 
     WHERE apa.district = s1.district 
     ORDER BY s1.id DESC 
     LIMIT 1 
   ) AS idActivityPackage 
   FROM 
   tbl_activity_package_address apa 
   ORDER BY apa.district 
  ) AS APA_T 
  ON ap.id = APA_T.idActivityPackage 
  GROUP BY t.name 
  ORDER BY APA_T.district 
) 
UNION DISTINCT 
( 
 SELECT GROUP_CONCAT(DISTINCT apa2.district ORDER BY apa2.district), t2.name 
 FROM tbl_activity AS t2 
 INNER JOIN tbl_activity_package AS ap2 ON t2.id = ap2.id_activity 
 INNER JOIN tbl_activity_package_address AS apa2 ON ap2.id = apa2.id_activity_package 
 GROUP BY t2.name 
 ORDER BY apa2.district 
) 
#LIMIT 6, 6

Here are the results:

GROUP_CONCAT(APA_T.DISTRICT ORDER BY APA_T.DISTRICT)    NAME
Beja,Faro,Setubal                                           activity-1
Evora                                                   activity-2
Sintra                                                  activity-4
Braga,Sines                                                 activity-5
Santarem                                                    activity-6
Guarda,Matosinhos,Sagres                                    activity-7
Lisboa,Montemor,Porto,Rio de Janeiro                    activity-8
Beja,Evora,Faro,Setubal                                 activity-1
Faro                                                    activity-3

Here are the results as we wish they were:

GROUP_CONCAT(APA_T.DISTRICT ORDER BY APA_T.DISTRICT)    NAME
Beja,Faro,Setubal                                           activity-1
Evora                                                   activity-2
Sintra                                                  activity-4
Braga,Sines                                                 activity-5
Santarem                                                    activity-6
Guarda,Matosinhos,Sagres                                    activity-7
Lisboa,Montemor,Porto,Rio de Janeiro                    activity-8
Faro                                                    activity-3

ISSUE

This line should NOT appear. No activity should appear twice.

Beja,Evora,Faro,Setubal                                 activity-1

We understand that the UNION DISTINCT doesn't remove it, because indeed: Beja, Faro, Setubal IS DIFFERENT THAN Beja,Evora,Faro,Setubal HOWEVER, we wish NOT to have Evora to appear on the first result. So, it is OK as it is, the first query on the UNION does it's job as it should.

Still, that second activity-1 that appears, should be removed.

Any advice on how to solve this?

THE BIG PICTURE As you can see, this is quite a huge select that will, perhaps, get worst and slow by time. We wish to have a INFINITE SCROLL of Activities, and the first results of that Infinite Scroll, should be from Activities happening on different districts. Why? Why can't we do it "order by date" or something, you may ask.

Because if a database back-end user do insert the last 20 records, all from one single district, we will have on the infinite scroll first list results, only activities from that district APPEARING that we don't have more than that district.

So, the point is to LIST ALL the results on a certain (complex) ORDER. :) Any other, perhaps better way, would be great.

http://sqlfiddle.com/#!2/37dd94/51

Foi útil?

Solução

Does the below (SQL Fiddle) produce the results you are looking for. I wrapped the union so I could then sort on the name field. If you don't want it that way then you can remove it or sort on the DistCon field instead.

SELECT * FROM 
(
  SELECT GROUP_CONCAT(APA_T.district) AS DistCon, t.name
  FROM tbl_activity AS t 
  JOIN tbl_activity_package AS ap ON t.id = ap.id_activity 
  JOIN 
  (
    SELECT DISTINCT apa.district AS district, 
    (
       SELECT s1.id_activity_package 
       FROM tbl_activity_package_address s1
       WHERE apa.district = s1.district
       ORDER BY s1.id DESC
       LIMIT 1
    ) AS idActivityPackage
    FROM 
    tbl_activity_package_address apa
    ORDER BY apa.district
  ) AS APA_T
  ON ap.id = APA_T.idActivityPackage
  GROUP BY t.name 
  UNION 
  SELECT GROUP_CONCAT(apa.district), t.name
  FROM tbl_activity AS t 
  JOIN tbl_activity_package AS ap ON t.id = ap.id_activity 
  JOIN tbl_activity_package_address AS apa ON ap.id = apa.id_activity_package
  WHERE t.name NOT IN 
  (
    SELECT DISTINCT t.name
    FROM tbl_activity AS t 
    JOIN tbl_activity_package AS ap ON t.id = ap.id_activity 
    JOIN 
    (
      SELECT DISTINCT apa.district AS district, 
      (
         SELECT s1.id_activity_package 
         FROM tbl_activity_package_address s1
         WHERE apa.district = s1.district
         ORDER BY s1.id DESC
         LIMIT 1
      ) AS idActivityPackage
      FROM 
      tbl_activity_package_address apa
    ) AS APA_T
    ON ap.id = APA_T.idActivityPackage
  )
  GROUP BY t.name 
) AS Mm
ORDER BY  Mm.name

Outras dicas

This query provides a slightly different result from that specified above because it employs slightly different rules.

Basically, it says "Give me at least one district for every activity. Where multiple districts offer the same activity, exclude any that are sole providers of another activity."

SELECT x.activity
     , GROUP_CONCAT(DISTINCT x.district) districts
  FROM
     ( SELECT a.name activity
            , apa.district 
         FROM tbl_activity a
         JOIN tbl_activity_package ap
           ON ap.id_activity = a.id
         JOIN tbl_activity_package_address apa
           ON apa.id_activity_package = ap.id
     ) x
  LEFT 
  JOIN
     ( SELECT a.name activity
            , apa.district 
         FROM tbl_activity a
         JOIN tbl_activity_package ap
           ON ap.id_activity = a.id
         JOIN tbl_activity_package_address apa
           ON apa.id_activity_package = ap.id
        GROUP
           BY activity 
       HAVING COUNT(*) = 1
     ) y
    ON y.district = x.district
   AND y.activity <> x.activity
 WHERE y.activity IS NULL
 GROUP
    BY activity;

+------------+--------------------------------------+
| activity   | districts                            |
+------------+--------------------------------------+
| activity-1 | Beja,Setubal                         |
| activity-2 | Evora                                |
| activity-3 | Faro                                 |
| activity-4 | Sintra                               |
| activity-5 | Braga,Sines                          |
| activity-6 | Santarem                             |
| activity-7 | Guarda,Sagres,Matosinhos             |
| activity-8 | Lisboa,Porto,Rio de Janeiro,Montemor |
+------------+--------------------------------------+
Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top