Question

I have 3 tables:

file_list => list of all files on web server  
stats_web => number of pageviews from public web  
stats_reg => number of pageviews from "registered users" 

I have the schema and some sample data on http://sqlfiddle.com/#!2/98c6a/1/0

I am trying to calculate the date that a file was first hit either by the public or by a registered user.

Current SQL looks like this:

SELECT
    list.path,
    list.mod_date,
    IF(MIN(ifnull(web.hit_date, 0000-00-00)) < MIN(ifnull(reg.hit_date, 0000-00-00)), MIN(ifnull(web.hit_date, 0000-00-00)), MIN(ifnull(reg.hit_date, 0000-00-00))) AS 'min_date',
    IF(MAX(ifnull(web.hit_date, 0000-00-00)) > MAX(ifnull(reg.hit_date, 0000-00-00)), MAX(ifnull(web.hit_date, 0000-00-00)), MAX(ifnull(reg.hit_date, 0000-00-00))) AS 'max_date',    
    SUM(ifnull(web.pages, 0)) + SUM(ifnull(reg.pages, 0)) AS 'page_views'
FROM
    file_list list
    LEFT JOIN
         stats_web web ON list.path = web.path
    LEFT JOIN
         stats_reg reg ON list.path = reg.path
WHERE
list.path LIKE '/web/htdocs/%'
GROUP BY list.path;

The problem is that if a record appears in only one of the stats tables the min date is always 0. This is because of the ifnull() on the MIN and MAX but if I don't use ifnull() then both min and max dates return NULL.

Was it helpful?

Solution

You should use a date in the distant future for the second argument of ifnull when it is used to compare the minimum. Otherwise, the first null would make 0000-00-00 the minimum, and since it's the earliest possible date, no other date would be able to beat it to become the minimum.

If you change the forth line of your query to

IF(MIN(ifnull(web.hit_date, 9999-12-31)) < MIN(ifnull(reg.hit_date, 9999-12-31)), MIN(ifnull(web.hit_date, 9999-12-31)), MIN(ifnull(reg.hit_date, 9999-12-31))) AS 'min_date'

you should get correct results.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top