Faster Way to Code this MySQL Query using GROUP BY, HAVING, COUNT, ORDER BY and 2 Tables

https://stackoverflow.com/questions/21460517

05-10-2022

Question

I am trying to query 2 tables with the statement below and get results based mainly on conditions in one of them. I would prefer 1 query over 1 query plus many sub-queries run inside a while or foreach loop.

SELECT a.request, a.city 
FROM pages a, TNDB_CSV2 b
WHERE a.main_id = b.PerformerID 
AND b.PCatID = '3' 
AND a.catnum = '303' 
AND a.city = b.City 
AND b.TicketsYN = 'Y' 
AND b.CountryID IN ('38', '217')  
GROUP BY b.PerformerID, b.City HAVING COUNT(*) > 4 
ORDER BY a.name ASC

So basically, I want to get results from 'pages' where 'TNDB_CSV2' has more than 4 matching records for the same 'PerformerID' and 'City' (COUNT(*) > 4, i.e. at least 5).

The query works correctly, but it takes between 55 and 67 seconds to run, which is far too long. Similar queries should take a fraction of a second. I have never grouped by 2 columns using HAVING and COUNT before, so I suspect there is a much more efficient way of doing this.

The query currently returns 1,011 records, and I checked that the results match the conditions.

Solution

Here is your query, formatted with a proper join clause:

SELECT a.request, a.city
FROM pages a
JOIN TNDB_CSV2 b
  ON a.main_id = b.PerformerID AND a.city = b.City
WHERE b.PCatID = '3'
  AND b.TicketsYN = 'Y'
  AND b.CountryID IN ('38', '217')
  AND a.catnum = '303'
GROUP BY b.PerformerID, b.City
HAVING COUNT(*) > 4
ORDER BY a.name ASC;

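You should be able to improve the performance of this query with composite indexes, so that MySQL can resolve the filters and the join from the indexes instead of scanning both tables. Here are two that I can think of: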

pages(catnum, main_id, city, name)
TNDB_CSV2(PerformerID, City, PCatID, TicketsYN, CountryID)
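
As a minimal sketch, the two suggested indexes could be created with the statements below. The index names are hypothetical, and the statements assume every listed column has a type MySQL can index in full (for example INT, or VARCHAR within the key-length limit; TEXT columns would need prefix lengths). The EXPLAIN at the end is one way to check whether the optimizer actually picks the indexes up.

```sql
-- Hypothetical index names; the column lists are the ones suggested above.

-- Leads with the equality filter on catnum, followed by the two join
-- columns; name is included as suggested in the answer.
CREATE INDEX idx_pages_catnum_main_city_name
    ON pages (catnum, main_id, city, name);

-- Leads with the two join/grouping columns; the trailing columns are the
-- ones filtered in the WHERE clause, so every TNDB_CSV2 column the query
-- touches is present in the index.
CREATE INDEX idx_tndb_performer_city_filters
    ON TNDB_CSV2 (PerformerID, City, PCatID, TicketsYN, CountryID);

-- Inspect the execution plan of the original query after adding the indexes.
EXPLAIN
SELECT a.request, a.city
FROM pages a
JOIN TNDB_CSV2 b
  ON a.main_id = b.PerformerID AND a.city = b.City
WHERE b.PCatID = '3'
  AND b.TicketsYN = 'Y'
  AND b.CountryID IN ('38', '217')
  AND a.catnum = '303'
GROUP BY b.PerformerID, b.City
HAVING COUNT(*) > 4
ORDER BY a.name ASC;
```

If the key column of the EXPLAIN output names these indexes and the estimated rows figures drop, the indexes are being used. If it does not, the column order may need adjusting for the actual data distribution.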

Licensed under: CC-BY-SA with attribution