Question

Here is the issue.

A site I've recently taken over tracks the "miles" you ran in a day. A user can log into the site and add that they ran 5 miles. This is then added to the database.

At the end of the day, around 1am, a service runs which calculates all the miles that all the users ran that day and outputs a text file to App_Data. That text file is then displayed in Flash on the home page.

I think this is kind of ridiculous. I was told they had to do this due to massive performance issues. They won't tell me exactly how they were doing it before or what the major performance issue was.

So what approach would you guys take? The first thing that popped into my mind was a web service which gets the data via an AJAX call. Perhaps every time a new "mile" entry is added, a trigger is fired and updates the "GlobalMiles" table.

I'd appreciate any info or tips on this.

Thanks so much!


Solution

Answering this question is a bit difficult since we don't know all of your requirements or what it was that didn't work before. So here are some different ideas.

First, revisit your assumptions. Generating a static report once a day is a perfectly valid solution if all you need is daily reports. Why hit the database multiple times throughout the day if all that's needed is a snapshot? For instance, lots of blog software used to write HTML files when an entry was posted rather than serving up the entry from the database each time, and many still do as an optimization. Is the "real-time" feature something you are adding?

I wouldn't jump to AJAX right away. Use the same input method, just move the report from static to dynamic. Doing too much at once is a good way to get yourself buried. When changing existing code I try to find areas that I can change in isolation with the least amount of impact on the rest of the application. Once you have the dynamic report, you can add AJAX (and please use progressive enhancement).

As for the dynamic report itself you have a few options.

Of course you can just SELECT SUM(), but it sounds like that would cause the performance problems if each user has a large number of entries.

If your database supports it, I would look at using an indexed view (sometimes called a materialized view). The database keeps it up to date as entries change, so it should allow fast reads of the real-time sum data:

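-- Indexed views need SCHEMABINDING, two-part object names (the default dbo
-- schema is assumed here), and COUNT_BIG(*) alongside GROUP BY.
-- [Count] must be declared NOT NULL for SUM to be allowed.
CREATE VIEW dbo.vw_Miles WITH SCHEMABINDING AS
SELECT UserId,
       SUM([Count]) AS TotalMiles,
       COUNT_BIG(*) AS EntryCount
FROM dbo.Miles
GROUP BY UserId
GO
-- The unique clustered index is what actually materializes the view.
CREATE UNIQUE CLUSTERED INDEX ix_Miles ON dbo.vw_Miles (UserId)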

If the overhead of that is too much, @jn29098's solution is a good one. Roll it up using a scheduled task. If there are a lot of entries for each user, you could add only the delta from the last time the task was run.

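-- ISNULL keeps the total intact when nothing was logged since the last run
-- (the subquery would return NULL, and TotalMiles + NULL is NULL).
UPDATE GlobalMiles SET [TotalMiles] = [TotalMiles] +
  ISNULL((SELECT SUM([Count])
    FROM Miles
    WHERE UserId = @id
      AND EntryDate > @lastTaskRun), 0)
WHERE UserId = @id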
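
To make the delta idea concrete, here is a minimal sketch of such a scheduled task. It uses Python's built-in sqlite3 as a stand-in for the real database, and the `TaskState` watermark table and the `roll_up` function are names I made up for illustration, not anything from the existing site:

```python
import sqlite3

SCHEMA = """
CREATE TABLE Miles (UserId INTEGER, [Count] REAL, EntryDate TEXT);
CREATE TABLE GlobalMiles (UserId INTEGER PRIMARY KEY, TotalMiles REAL NOT NULL);
-- Hypothetical single-row table remembering when the task last ran.
CREATE TABLE TaskState (LastRun TEXT NOT NULL);
INSERT INTO TaskState VALUES ('1970-01-01 00:00:00');
"""

def roll_up(conn, now):
    """Add only the miles logged since the previous run to each user's total."""
    with conn:  # one transaction: totals and the watermark change together
        (last_run,) = conn.execute("SELECT LastRun FROM TaskState").fetchone()
        deltas = conn.execute(
            "SELECT UserId, SUM([Count]) FROM Miles "
            "WHERE EntryDate > ? AND EntryDate <= ? GROUP BY UserId",
            (last_run, now),
        ).fetchall()
        for user_id, delta in deltas:
            cur = conn.execute(
                "UPDATE GlobalMiles SET TotalMiles = TotalMiles + ? WHERE UserId = ?",
                (delta, user_id),
            )
            if cur.rowcount == 0:  # first rollup for this user
                conn.execute("INSERT INTO GlobalMiles VALUES (?, ?)", (user_id, delta))
        conn.execute("UPDATE TaskState SET LastRun = ?", (now,))

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO Miles VALUES (1, 5, '2024-01-01 08:00:00')")
roll_up(conn, "2024-01-01 23:59:59")
conn.execute("INSERT INTO Miles VALUES (1, 3, '2024-01-02 08:00:00')")
roll_up(conn, "2024-01-02 23:59:59")
total = conn.execute(
    "SELECT TotalMiles FROM GlobalMiles WHERE UserId = 1"
).fetchone()[0]
```

Bounding the query with both the previous and the current run time means an entry that arrives while the task is running is picked up by the next run instead of being skipped or counted twice.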

If you don't care about storing the individual entries but only the total you can update the count on the fly:

UPDATE Miles SET [Count] = [Count] + @newCount WHERE UserId = @id

You could use this method in conjunction with the sproc that adds the entry and get the best of both worlds.
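
Roughly, combining the two means the code that adds an entry also bumps the running total in the same transaction. A sketch, assuming a separate `GlobalMiles` totals table and using sqlite3 in place of the real sproc (`add_entry` is a hypothetical name):

```python
import sqlite3

def add_entry(conn, user_id, miles, entry_date):
    """Store the individual entry and bump the user's total atomically."""
    with conn:  # both statements commit together, or both roll back on error
        conn.execute(
            "INSERT INTO Miles (UserId, [Count], EntryDate) VALUES (?, ?, ?)",
            (user_id, miles, entry_date),
        )
        cur = conn.execute(
            "UPDATE GlobalMiles SET TotalMiles = TotalMiles + ? WHERE UserId = ?",
            (miles, user_id),
        )
        if cur.rowcount == 0:  # user has no total row yet
            conn.execute(
                "INSERT INTO GlobalMiles (UserId, TotalMiles) VALUES (?, ?)",
                (user_id, miles),
            )

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Miles (UserId INTEGER, [Count] REAL, EntryDate TEXT);
CREATE TABLE GlobalMiles (UserId INTEGER PRIMARY KEY, TotalMiles REAL NOT NULL);
""")
add_entry(conn, 1, 5, "2024-01-01")
add_entry(conn, 1, 2.5, "2024-01-02")
total = conn.execute(
    "SELECT TotalMiles FROM GlobalMiles WHERE UserId = 1"
).fetchone()[0]
entries = conn.execute("SELECT COUNT(*) FROM Miles").fetchone()[0]
```

You keep the individual entries for history, and the report only ever reads one row per user.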

Finally, your trigger method would work as well. It's an alternative to the indexed view where you do the update yourself on a table instead of the database doing it automatically. It's also similar to the previous option, except that you move the global update out of the sproc and into a trigger.

The last three options make it more difficult to handle the situation when an entry is removed, although if that's not a feature of your application then you may not need to worry about that.
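
For the trigger route, removal can be covered with a second trigger that subtracts the deleted entry. This is only a sketch in SQLite syntax (run through Python's sqlite3), so the trigger names are invented and the T-SQL version would look different, since SQL Server triggers work on the `inserted` and `deleted` pseudo-tables rather than per row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Miles (Id INTEGER PRIMARY KEY, UserId INTEGER, [Count] REAL);
CREATE TABLE GlobalMiles (UserId INTEGER PRIMARY KEY, TotalMiles REAL NOT NULL DEFAULT 0);

-- Adding an entry: make sure the total row exists, then add the miles.
CREATE TRIGGER trg_Miles_Insert AFTER INSERT ON Miles
BEGIN
    INSERT OR IGNORE INTO GlobalMiles (UserId, TotalMiles) VALUES (NEW.UserId, 0);
    UPDATE GlobalMiles SET TotalMiles = TotalMiles + NEW.[Count]
    WHERE UserId = NEW.UserId;
END;

-- Removing an entry: subtract the miles it contributed.
CREATE TRIGGER trg_Miles_Delete AFTER DELETE ON Miles
BEGIN
    UPDATE GlobalMiles SET TotalMiles = TotalMiles - OLD.[Count]
    WHERE UserId = OLD.UserId;
END;
""")

conn.execute("INSERT INTO Miles (UserId, [Count]) VALUES (1, 5)")
conn.execute("INSERT INTO Miles (UserId, [Count]) VALUES (1, 3)")
conn.execute("DELETE FROM Miles WHERE UserId = 1 AND [Count] = 3")
total = conn.execute(
    "SELECT TotalMiles FROM GlobalMiles WHERE UserId = 1"
).fetchone()[0]
```

If entries can also be edited, you would need a third trigger on update that applies the difference between the old and new values.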

Now that you've got materialized, real-time data in your database, you can dynamically generate your report. Then you can add the fancy stuff with AJAX.

OTHER TIPS

If they are truly having performance issues due to too many hits on the database, then I suggest that you take all the input and cram it into a message queue (MSMQ). Then you can have a service on the other end that picks up the messages and does a bulk insert of the data. This way you have fewer db hits. Then you can output to the text file on the update too.
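
The shape of that idea, sketched with an in-process `queue.Queue` standing in for MSMQ and sqlite3 standing in for the database (`log_miles` and `flush` are hypothetical names, and a real MSMQ setup would also give you durability that this toy version does not):

```python
import queue
import sqlite3

pending = queue.Queue()  # in-process stand-in for the message queue

def log_miles(user_id, miles):
    """Called from the web request: enqueue instead of hitting the database."""
    pending.put((user_id, miles))

def flush(conn, max_batch=1000):
    """Called by the background service: drain the queue, insert in one batch."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(pending.get_nowait())
        except queue.Empty:
            break
    if batch:
        with conn:  # one transaction for the whole batch
            conn.executemany(
                "INSERT INTO Miles (UserId, [Count]) VALUES (?, ?)", batch
            )
    return len(batch)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Miles (UserId INTEGER, [Count] REAL)")
log_miles(1, 5)
log_miles(2, 3.5)
log_miles(1, 2)
written = flush(conn)
rows = conn.execute("SELECT COUNT(*) FROM Miles").fetchone()[0]
```

Three page requests end up as a single database round trip, which is the whole point of batching.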

I would create a summary table that's rolled up once/hour or nightly which calculates total miles run. For individual requests you could pull from the nightly summary table plus any additional logged miles for the period between the last rollup calculation and when the user views the page to get the total for that user.
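
On the read side that could look something like the following sketch. The `MilesSummary` and `RollupState` tables and the `total_for_user` function are assumptions for the example, and sqlite3 is used only so it runs anywhere:

```python
import sqlite3

def total_for_user(conn, user_id):
    """Rolled-up total plus any miles logged after the last rollup."""
    row = conn.execute(
        """
        SELECT COALESCE((SELECT TotalMiles FROM MilesSummary WHERE UserId = :u), 0)
             + COALESCE((SELECT SUM([Count]) FROM Miles
                         WHERE UserId = :u
                           AND EntryDate > (SELECT LastRollup FROM RollupState)), 0)
        """,
        {"u": user_id},
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Miles (UserId INTEGER, [Count] REAL, EntryDate TEXT);
CREATE TABLE MilesSummary (UserId INTEGER PRIMARY KEY, TotalMiles REAL NOT NULL);
CREATE TABLE RollupState (LastRollup TEXT NOT NULL);

INSERT INTO RollupState VALUES ('2024-01-02 01:00:00');
INSERT INTO MilesSummary VALUES (1, 100);                   -- as of the last rollup
INSERT INTO Miles VALUES (1, 4, '2024-01-01 18:00:00');     -- already rolled up
INSERT INTO Miles VALUES (1, 6, '2024-01-02 07:30:00');     -- logged since then
""")
current = total_for_user(conn, 1)
```

Only the entries newer than the last rollup are summed at request time, so the per-request cost stays small however much history a user has.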

How many users are you talking about and how many log records per day?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow