Question

Due to faulty hardware, the statistics generated over a 2-week period were significantly higher than normal (10,000 times higher).

After moving the application to a new server, the problem rectified itself. The issue I have is that there are 2 weeks of stats that are clearly wrong.

I have checked the raw impressions table for the affected fortnight and it seems to be correct (i.e. stats per banner per day match the average for the previous month). In the intermediate and summary impressions tables, however, the values are inflated.

I understand from the OpenX forum (link text) that it is possible to regenerate stats from the raw data, but only one hour at a time, so regenerating stats for 2 weeks would be very time-consuming.

Is there another, more efficient way to regenerate the stats from the raw data for the affected fortnight?


Solution

Have a look at this link; it appears to contain a solution you may find helpful. It is similar to the one you mention in your question, but appears to have been modified to make it easier to use. Other than using regenerateAdServerStatistics.php, I do not know of another option for regenerating the statistics you need.

OTHER TIPS

"I understand from the openx forum (link text) it's possible to regenerate stats from the raw data but it will only regenerate stats per hour, meaning regenerating stats for 2 weeks would be very time consuming"

We solved this problem on our installation by creating a wrapper shell script for regenerateAdServerStatistics.php that takes dateStart and dateEnd arguments, for situations like the one you mention. It is used to:

  1. regenerate statistics for a specific day (all hours, takes ~2h)
  2. run normal maintenance to keep today's stats updated
  3. go back to step 1 for the next day, as long as the day processed < dateEnd
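
The loop above could be sketched roughly as follows. This is not our actual script (that one is a shell script and does more), just a minimal Python illustration of the day-by-day structure. The names days_between and regenerate_range are made up for this example, and the two callables regenerate_day and run_maintenance are placeholders: in practice they would invoke regenerateAdServerStatistics.php and the normal maintenance job with whatever arguments your OpenX version expects, which I am deliberately not guessing at here. The sketch also treats dateEnd as inclusive, which is an assumption.

```python
from datetime import date, timedelta
from typing import Callable, Iterator, List


def days_between(start: date, end: date) -> Iterator[date]:
    """Yield every day from start to end, inclusive (assumed semantics)."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def regenerate_range(
    start: date,
    end: date,
    regenerate_day: Callable[[date], None],
    run_maintenance: Callable[[], None],
) -> List[date]:
    """Regenerate stats one day at a time, running maintenance in between.

    regenerate_day and run_maintenance are hypothetical hooks; they stand in
    for calls to the real regeneration and maintenance scripts.
    """
    processed = []
    for day in days_between(start, end):
        regenerate_day(day)   # step 1: rebuild all hours of this day
        run_maintenance()     # step 2: keep today's stats up to date
        processed.append(day)  # step 3: the loop moves on to the next day
    return processed


if __name__ == "__main__":
    # Dry run with example dates: print what would be done instead of doing it.
    regenerate_range(
        date(2010, 3, 1),
        date(2010, 3, 14),
        regenerate_day=lambda d: print("regenerate", d.isoformat()),
        run_maintenance=lambda: print("run maintenance"),
    )
```

Passing the two steps in as callables makes it easy to do a dry run first and only swap in the real commands once the date range looks right.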

To be honest, the script is somewhat more complex, as we also need to import raw data from our data warehouse for each day to be processed, because the "live" data is kept in an in-memory database, but that is outside the scope of this post.

Licensed under: CC-BY-SA with attribution