Question

Sorry if the title is a little... Crappy. Basically I'm writing a small forum and using multiple sub-queries to select the number of threads, number of posts, and the date of the last post in a forum while grabbing the forum's information at the same time to display on the main page!

This is my query, since I suck at explaining things:

SELECT `f`.*,
    (SELECT COUNT(`id`)
    FROM `forum_threads` 
    WHERE `forumId1` = `f`.`id1`
        AND `forumId2` = `f`.`id2`) AS `threadCount`,
    (SELECT COUNT(`id`)
    FROM `forum_posts` 
    WHERE `forumId1` = `f`.`id1`
        AND `forumId2` = `f`.`id2`) AS `postCount`,
    (SELECT `date`
    FROM `forum_posts` 
    WHERE `forumId1` = `f`.`id1` 
        AND `forumId2` = `f`.`id2` 
        ORDER BY `date` DESC LIMIT 1) AS `lastPostDate`
FROM `forum_forums` AS `f`
ORDER BY `f`.`position` ASC, `f`.`id1` ASC;

And I'm using a plain foreach loop to display the results:

foreach($forums AS $forum) {
    echo $forum->name .'<br />';
    echo $forum->threadCount .'<br />';
    echo $forum->postCount .'<br />';
    echo $forum->lastPostDate .'<br />';
}

(Not exactly like that of course, but for the sake of explaining...)

Now I was wondering if that would be "bad" for performance, or if there was any better way of doing it? Assuming there are quite a few posts and threads in each forum.

I was originally storing "posts", "threads", and "lastPost" columns in the forum table itself, and was going to increment (posts = posts + 1) the values every time someone created a new thread or post. Though I had this idea as well and was wondering if it was any good. :P

Solution

I would do things a bit differently:

It seems to me that all three fields (threadCount, postCount and lastPostDate) can be maintained in a separate table, say forum_stats, which will hold only 4 columns:
* forum_id
* thread_count
* post_count
* last_post_date

These columns can be updated via a trigger on insert/update (and on delete, if threads or posts can be removed).
If you pay this small overhead on write operations, you'll get a very fast query for the select (and it will remain very fast regardless of the number of forums/posts/threads you have).
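
A minimal sketch of the trigger idea, under a few assumptions: the schema is simplified to a single forum_id key instead of the question's forumId1/forumId2 pair, the trigger names are made up, and SQLite (through Python's sqlite3 module) stands in for MySQL so the example runs on its own. MySQL's CREATE TRIGGER syntax differs in detail (FOR EACH ROW, no INSERT OR IGNORE), so this shows the shape of the solution rather than something to paste in.

```python
import sqlite3

# Assumed, simplified schema: one forum_id column instead of forumId1/forumId2.
# SQLite is a stand-in for MySQL here, and trigger syntax differs between the two.
SCHEMA = """
CREATE TABLE forum_threads (
    id       INTEGER PRIMARY KEY,
    forum_id INTEGER NOT NULL
);
CREATE TABLE forum_posts (
    id       INTEGER PRIMARY KEY,
    forum_id INTEGER NOT NULL,
    date     TEXT NOT NULL  -- ISO-style text, so text order equals time order
);
CREATE TABLE forum_stats (
    forum_id       INTEGER PRIMARY KEY,
    thread_count   INTEGER NOT NULL DEFAULT 0,
    post_count     INTEGER NOT NULL DEFAULT 0,
    last_post_date TEXT
);

-- Hypothetical trigger names. Each trigger creates the stats row on first
-- use, then bumps the relevant counter.
CREATE TRIGGER threads_ai AFTER INSERT ON forum_threads
BEGIN
    INSERT OR IGNORE INTO forum_stats (forum_id) VALUES (NEW.forum_id);
    UPDATE forum_stats
       SET thread_count = thread_count + 1
     WHERE forum_id = NEW.forum_id;
END;

CREATE TRIGGER posts_ai AFTER INSERT ON forum_posts
BEGIN
    INSERT OR IGNORE INTO forum_stats (forum_id) VALUES (NEW.forum_id);
    UPDATE forum_stats
       SET post_count = post_count + 1,
           last_post_date = MAX(COALESCE(last_post_date, NEW.date), NEW.date)
     WHERE forum_id = NEW.forum_id;
END;
"""


def trigger_demo():
    """Insert one thread and two posts, then read the maintained stats row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO forum_threads (forum_id) VALUES (1)")
    conn.execute(
        "INSERT INTO forum_posts (forum_id, date) VALUES (1, '2024-01-02 09:30:00')"
    )
    conn.execute(
        "INSERT INTO forum_posts (forum_id, date) VALUES (1, '2024-01-01 10:00:00')"
    )
    row = conn.execute(
        "SELECT thread_count, post_count, last_post_date "
        "FROM forum_stats WHERE forum_id = 1"
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(trigger_demo())
```

Only the insert triggers are shown. Matching delete triggers would decrement the counters, and the one on forum_posts would also have to recompute last_post_date from the remaining rows.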

Another approach (not as good, IMO):
Create the stats table and run a batch job daily (or every few hours) to update the stats. The price is that the data you display will lag behind by up to one job interval, and the job itself needs resources. Since it's heavy, you might want to run it only at night, for example, so that it doesn't affect the majority of your website visitors.
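
A rough sketch of what such a batch job could run, again assuming a simplified single-column forum_id key and using SQLite through Python's sqlite3 as a stand-in for MySQL. The function names are hypothetical, and how the job gets scheduled (cron, a task queue, and so on) is left out. The statement is essentially the question's sub-query version, executed once per job run instead of once per page view.

```python
import sqlite3

# Assumed, simplified schema (single forum_id key, SQLite instead of MySQL).
SCHEMA = """
CREATE TABLE forum_forums  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE forum_threads (id INTEGER PRIMARY KEY, forum_id INTEGER NOT NULL);
CREATE TABLE forum_posts   (id INTEGER PRIMARY KEY, forum_id INTEGER NOT NULL,
                            date TEXT NOT NULL);
CREATE TABLE forum_stats (
    forum_id       INTEGER PRIMARY KEY,
    thread_count   INTEGER NOT NULL,
    post_count     INTEGER NOT NULL,
    last_post_date TEXT
);
"""

# INSERT OR REPLACE is SQLite syntax; in MySQL the rough equivalents are
# REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.
REBUILD = """
INSERT OR REPLACE INTO forum_stats
    (forum_id, thread_count, post_count, last_post_date)
SELECT f.id,
       (SELECT COUNT(*)  FROM forum_threads AS t WHERE t.forum_id = f.id),
       (SELECT COUNT(*)  FROM forum_posts   AS p WHERE p.forum_id = f.id),
       (SELECT MAX(date) FROM forum_posts   AS p WHERE p.forum_id = f.id)
FROM forum_forums AS f
"""


def rebuild_stats(conn):
    """Recompute every forum's stats row and commit the result."""
    with conn:
        conn.execute(REBUILD)


def batch_demo():
    """Build a tiny data set, run the job once, and return the stats rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO forum_forums (id, name) VALUES (1, 'General'), (2, 'Empty')"
    )
    conn.execute("INSERT INTO forum_threads (forum_id) VALUES (1), (1)")
    conn.execute(
        "INSERT INTO forum_posts (forum_id, date) VALUES "
        "(1, '2024-03-01 08:00:00'), "
        "(1, '2024-03-02 12:00:00'), "
        "(1, '2024-02-28 23:59:00')"
    )
    rebuild_stats(conn)
    rows = conn.execute(
        "SELECT forum_id, thread_count, post_count, last_post_date "
        "FROM forum_stats ORDER BY forum_id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in batch_demo():
        print(row)
```

The page query then reads forum_stats only, so its cost no longer depends on how many posts exist. The heavy counting is paid once per job run.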

OTHER TIPS

Usually this kind of thing is terrible from a performance perspective and you'd be better off with counter columns that you can fetch from a single row. Keeping these in sync can be annoying, but there's no retrieval cost once they're in there.

You've identified the data you're retrieving, so what you need to do next is figure out how to put that data in there in the first place. @alfasin's answer describes an example schema, and while putting it in a separate table is one idea, there's usually little harm in just putting the counters in the main one. If you're worried about locking, update in smaller batches.

One approach is to write a TRIGGER that updates the counters as records are added and removed from the various tables. This tends to hide a lot of the complexity which can be a bad thing if the logic changes often and people need to be aware of how the system works.

A simple method is to just fiddle with the columns using an additional query after you've created or removed something that would have updated them. For instance, adjusting the last-posted-date is trivial if you do it at the time a post is created.
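
As a sketch of that "additional query" approach: the function below inserts a post and adjusts the counter columns in the same transaction. It assumes the posts and lastPost columns from the question live on forum_forums, a simplified single-column forum key, a made-up body column, and SQLite via Python's sqlite3 as a stand-in for MySQL. create_post is a hypothetical name, not part of any existing code.

```python
import sqlite3

# Assumed schema: counter columns named as in the question, single forum key.
SCHEMA = """
CREATE TABLE forum_forums (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    threads  INTEGER NOT NULL DEFAULT 0,
    posts    INTEGER NOT NULL DEFAULT 0,
    lastPost TEXT
);
CREATE TABLE forum_posts (
    id       INTEGER PRIMARY KEY,
    forum_id INTEGER NOT NULL,
    body     TEXT NOT NULL,
    date     TEXT NOT NULL
);
"""


def create_post(conn, forum_id, body, posted_at):
    """Insert a post and adjust the forum's counters in one transaction."""
    with conn:  # commits both statements together, rolls both back on error
        conn.execute(
            "INSERT INTO forum_posts (forum_id, body, date) VALUES (?, ?, ?)",
            (forum_id, body, posted_at),
        )
        conn.execute(
            "UPDATE forum_forums SET posts = posts + 1, lastPost = ? WHERE id = ?",
            (posted_at, forum_id),
        )


def counter_demo():
    """Create two posts and return the forum's stored counters."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO forum_forums (id, name) VALUES (1, 'General')")
    conn.commit()
    create_post(conn, 1, "first", "2024-05-01 10:00:00")
    create_post(conn, 1, "second", "2024-05-01 10:05:00")
    row = conn.execute(
        "SELECT posts, lastPost FROM forum_forums WHERE id = 1"
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(counter_demo())
```

Keeping both statements in one transaction means a failed insert cannot leave the counter bumped, which removes one common source of drift.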

If these counters get a bit screwy, and they will eventually, you need a method to bring them back into sync. An easy way is to write a VIEW that produces the same results your query does now, perhaps re-written to use LEFT JOIN instead, and then UPDATE against that if that's possible. This may involve using a temporary table if MySQL can't cope with updating a table with a view of itself, but that's usually not a big deal.
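
One way the resync could look, sketched with SQLite via Python's sqlite3 rather than MySQL. The view name forum_live_stats, the temporary table fresh_stats and the single-column forum key are all assumptions made for the example. The view is snapshotted into a temporary table before the UPDATE, which is the workaround mentioned above for the case where the database refuses to update a table through a view of itself.

```python
import sqlite3

SCHEMA = """
CREATE TABLE forum_forums (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    threads  INTEGER NOT NULL DEFAULT 0,
    posts    INTEGER NOT NULL DEFAULT 0,
    lastPost TEXT
);
CREATE TABLE forum_threads (id INTEGER PRIMARY KEY, forum_id INTEGER NOT NULL);
CREATE TABLE forum_posts   (id INTEGER PRIMARY KEY, forum_id INTEGER NOT NULL,
                            date TEXT NOT NULL);

-- Hypothetical view name. Aggregates are computed per forum and LEFT JOINed,
-- so forums without threads or posts still get a row.
CREATE VIEW forum_live_stats AS
SELECT f.id             AS forum_id,
       COALESCE(t.n, 0) AS threads,
       COALESCE(p.n, 0) AS posts,
       p.last_date      AS lastPost
FROM forum_forums AS f
LEFT JOIN (SELECT forum_id, COUNT(*) AS n
             FROM forum_threads GROUP BY forum_id) AS t ON t.forum_id = f.id
LEFT JOIN (SELECT forum_id, COUNT(*) AS n, MAX(date) AS last_date
             FROM forum_posts GROUP BY forum_id) AS p ON p.forum_id = f.id;
"""


def resync_counters(conn):
    """Overwrite the stored counters with freshly computed values."""
    # Snapshot the view first, so the UPDATE never reads a view of the
    # table it is writing to.
    conn.execute("DROP TABLE IF EXISTS temp.fresh_stats")
    conn.execute("CREATE TEMP TABLE fresh_stats AS SELECT * FROM forum_live_stats")
    conn.execute("""
        UPDATE forum_forums
           SET threads  = (SELECT s.threads  FROM fresh_stats AS s
                            WHERE s.forum_id = forum_forums.id),
               posts    = (SELECT s.posts    FROM fresh_stats AS s
                            WHERE s.forum_id = forum_forums.id),
               lastPost = (SELECT s.lastPost FROM fresh_stats AS s
                            WHERE s.forum_id = forum_forums.id)
    """)
    conn.execute("DROP TABLE temp.fresh_stats")
    conn.commit()


def resync_demo():
    """Start from deliberately wrong counters and return the repaired rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO forum_forums (id, name, threads, posts) "
        "VALUES (1, 'General', 99, 0), (2, 'Empty', 5, 7)"
    )
    conn.execute("INSERT INTO forum_threads (forum_id) VALUES (1)")
    conn.execute(
        "INSERT INTO forum_posts (forum_id, date) VALUES "
        "(1, '2024-06-01 09:00:00'), (1, '2024-06-03 18:30:00')"
    )
    resync_counters(conn)
    rows = conn.execute(
        "SELECT id, threads, posts, lastPost FROM forum_forums ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in resync_demo():
        print(row)
```

Grouping inside each derived table before joining avoids the row multiplication you would get by joining threads and posts directly against the forum and counting afterwards.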

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow