Question

I'm running a cron task that makes lots of queries to a MySQL server. The main issue is that the server sometimes runs extremely slowly.

I've got one relatively big query that left joins four tables, and four smaller queries with natural joins that also hit the first table. After running those queries, I process and group the results in PHP.

What I'm planning is to merge those five queries into a single big query, and then let PHP do some quick sort()s where needed.

I've also been told that MySQL filters and sorts faster than PHP, but I'm wary of a query with seven or eight left joins. Some more details about these queries (which I can't copy because of company policy):

  • Every fetched row and field will be visited at least once.
  • Every query runs based on a single main table, with some "wing" tables.
  • Every query uses the same GROUP BY rule.
  • Currently the PHP code splits the secondary queries' results into multiple arrays. With a single big query, it would also have to sort the results by multiple parameters.

So, given these constraints, and maybe as a rule of thumb:

What is faster: a big joined query with more PHP, or multiple small selects with less PHP?


Solution

As a rule of thumb, the fewer queries the better: every query sent to MySQL carries a fixed overhead, however complex the query is. That said, PHP is surprisingly fast for some things, and if your sort cannot use an index (which sounds likely if you are effectively sorting the results of several queries unioned together), a sort in PHP may well be comparable or even faster.

Where there is a big difference is when you take the results of one query and then run another query for each returned row. In that situation the number of queries can get out of hand rapidly without being noticed. At work I found a menu-generation script that ran one query to get the top-level menu items and then another query per top-level item to fetch its children. It was easily rewritten as a join, and the surprising part was the performance difference: the time taken to generate the menu dropped from 0.2 seconds to 0.002 seconds.
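The pattern above can be sketched in PHP. This is a minimal, runnable illustration using an in-memory SQLite database via PDO so it works without a MySQL server; the table and column names (menu_items, parent_id, title) are invented for the example, not taken from the original application:

```php
<?php
// Set up a tiny self-referencing menu table (hypothetical schema).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE menu_items (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT)");
$pdo->exec("INSERT INTO menu_items VALUES (1, NULL, 'Home'), (2, NULL, 'Shop'), (3, 2, 'Cart')");

// N+1 pattern: one query for the top level, then one query per parent row.
$menuSlow = [];
foreach ($pdo->query("SELECT id, title FROM menu_items WHERE parent_id IS NULL ORDER BY id") as $p) {
    $stmt = $pdo->prepare("SELECT title FROM menu_items WHERE parent_id = ?");
    $stmt->execute([$p['id']]);                    // one extra round trip per parent
    $menuSlow[$p['title']] = $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Join rewrite: a single round trip, then group the flat rows in PHP.
$menuFast = [];
$sql = "SELECT p.title AS parent, c.title AS child
          FROM menu_items p
          LEFT JOIN menu_items c ON c.parent_id = p.id
         WHERE p.parent_id IS NULL
         ORDER BY p.id";
foreach ($pdo->query($sql) as $row) {
    $menuFast[$row['parent']] = $menuFast[$row['parent']] ?? [];
    if ($row['child'] !== null) {
        $menuFast[$row['parent']][] = $row['child'];
    }
}
```

Both loops build the same structure; the difference is that the first issues one query per parent row while the second issues exactly one.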

But it is a case-by-case decision. I once had a requirement to return rows based on a Levenshtein distance (essentially a score of how different two strings are). Using a MySQL custom function this was possible, and it greatly reduced the number of rows returned, but it was quite slow. PHP's built-in levenshtein() function is massively faster, and it proved more efficient to return several times as many rows, compute the Levenshtein distance in PHP, and then drop the records that were no longer required.

In the situation you describe, I suspect the difference would be marginal: you would be merging five queries into one more complex query. However, without seeing the table structures and queries (which unfortunately you can't provide) it is difficult to be certain. It might well be efficient to run a single reasonably complex query, skip the sorting where it is not strictly necessary, and perform it in PHP instead (usort() with a user-defined comparison function is useful for this).
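For the multi-parameter sort mentioned in the question, usort() with a comparison callback handles it in a few lines. The field names (category, score) here are placeholders for whatever the real rows contain:

```php
<?php
// Hypothetical merged result set, as associative arrays.
$rows = [
    ['category' => 'b', 'score' => 10],
    ['category' => 'a', 'score' => 5],
    ['category' => 'a', 'score' => 20],
];

usort($rows, function (array $x, array $y): int {
    // Primary key: category ascending; secondary key: score descending.
    // Swapping the score operands inverts that key's direction.
    return [$x['category'], $y['score']] <=> [$y['category'], $x['score']];
});
```

This mirrors an `ORDER BY category ASC, score DESC` clause, but runs after the rows have been fetched, so the query itself needs no sort at all.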

There is the further issue that a complex query is harder to maintain. Plenty of people can knock a PHP script together or follow a simple SQL query, but the number who can understand complex SQL queries is worryingly small.

OTHER TIPS

In my experience SQL queries are faster. I also use many tables in some of my apps, and I found that running simple queries and assembling the data sets in PHP is slower; performance really improves if you push the work to the SQL side.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow