Question

I’m currently using the following query for jsPerf. In the likely case you don’t know jsPerf — there are two tables: pages containing the test cases / revisions, and tests containing the code snippets for the tests inside the test cases.

There are currently 937 records in pages and 3817 records in tests.

As you can see, it takes quite a while to load the “Browse jsPerf” page where this query is used.

The query takes about 7 seconds to execute:

SELECT
 id AS pID,
 slug AS url,
 revision,
 title,
 published,
 updated,
 (
  SELECT COUNT(*)
  FROM pages
  WHERE slug = url
  AND visible = "y"
 ) AS revisionCount,
 (
  SELECT COUNT(*)
  FROM tests
  WHERE pageID = pID
 ) AS testCount
 FROM pages
 WHERE updated IN (
  SELECT MAX(updated)
  FROM pages
  WHERE visible = "y"
  GROUP BY slug
 )
 AND visible = "y"
 ORDER BY updated DESC

I’ve added indexes on all fields that appear in WHERE clauses. Should I add more?

How can this query be optimized?

P.S. I know I could implement a caching system in PHP — I probably will, so please don’t tell me :) I’d just really like to find out how this query could be improved, too.

Was it helpful?

Solution

Use:

   SELECT x.id AS pID,
          x.slug AS url,
          x.revision,
          x.title,
          x.published,
          x.updated,
          y.revisionCount,
          COALESCE(z.testCount, 0) AS testCount
     FROM pages x
     JOIN (SELECT p.slug,
                  MAX(p.updated) AS max_updated,
                  COUNT(*) AS revisionCount
             FROM pages p
            WHERE p.visible = 'y'
         GROUP BY p.slug) y ON y.slug = x.slug
                           AND y.max_updated = x.updated
LEFT JOIN (SELECT t.pageid,
                  COUNT(*) AS testCount
             FROM tests t
         GROUP BY t.pageid) z ON z.pageid = x.id
 ORDER BY updated DESC

OTHER TIPS

You want to learn how to use EXPLAIN. This will execute the sql statement, and show you which indexes are being used, and what row scans are being performed. The goal is to reduce the number of row scans (ie, the database searching row by row for values).

You may want to try the subqueries one at a time to see which one is slowest.

This query:

SELECT MAX(updated)
  FROM pages
  WHERE visible = "y"
  GROUP BY slug

Makes it sort the result by slug. This is probably slow.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top